Be it known that Leslie B. Vosshall, et al.

have invented certain new and useful improvements in

GENES ENCODING INSECT ODORANT RECEPTORS AND USES THEREOF

of which the following is a full, clear and exact description.

GENES ENCODING INSECT ODORANT RECEPTORS AND USES THEREOF
This application claims priority to and is a
continuation-in-part of U.S. Serial No. 09/257,706, filed
February 25, 1999, the contents of which are hereby
incorporated by reference.

The invention disclosed herein was made with Government
support under NIH:NIMH, 5P50, MH50733-05 and the NINDS,
NS29832-07 from the Department of Health and Human Services.
Accordingly, the U.S. Government has certain rights in this
invention.

Throughout this application, various publications are
referred to by arabic numeral within parentheses. Full
citations for these publications are presented immediately
before the claims. Disclosures of these publications in
their entireties are hereby incorporated by reference into
this application in order to more fully describe the state
of the art to which this invention pertains.

BACKGROUND OF THE INVENTION 

All animals possess a "nose," an olfactory sense organ that
allows for the recognition and discrimination of chemosensory
information in the environment. Humans, for example, are
thought to recognize over 10,000 discrete odors with
exquisite discriminatory power such that subtle differences
in chemical structure can often lead to profound differences
in perceived odor quality. What mechanisms have evolved to
allow the recognition and discrimination of complex olfactory
information and how is olfactory perception ultimately
translated into appropriate behavioral responses? The
recognition of odors is accomplished by odorant receptors
that reside on olfactory cilia, a specialization of the
dendrite of the olfactory sensory neuron. The odorant
receptor genes encode novel serpentine receptors that
traverse the membrane seven times. In several vertebrate
species, and in the invertebrate Caenorhabditis elegans, as
many as 1000 genes encode odorant receptors, suggesting that
1-5% of the coding potential of the genome in these organisms
is devoted to the recognition of olfactory sensory stimuli
(Buck and Axel, 1991; Levy et al., 1991; Parmentier et al.,
1992; Ben-Arie et al., 1994; Troemel et al., 1995; Sengupta
et al., 1996; Robertson, 1998). Thus, unlike color vision in
which three photoreceptors can absorb light across the entire
visible spectrum, these data suggest that a small number of
odorant receptors is insufficient to recognize the full
spectrum of distinct molecular structures perceived by the
olfactory system. Rather, the olfactory sensory system
employs an extremely large number of receptors, each capable
of recognizing a small number of odorous ligands.

The discrimination of olfactory information requires that the
brain discern which of the numerous receptors have been
activated by an odorant. In mammals, individual olfactory
sensory neurons express only one of a thousand receptor genes
such that the neurons are functionally distinct (Ngai et al.,
1993; Ressler et al., 1993; Vassar et al., 1993; Chess et
al., 1994; Dulac and Axel, unpublished). The axons from
olfactory neurons expressing a specific receptor converge
upon two spatially invariant glomeruli among the 1800
glomeruli within the olfactory bulb (Ressler et al., 1994;
Vassar et al., 1994; Mombaerts et al., 1996; Wang et al.,
1998). The bulb therefore provides a spatial map that
identifies which of the numerous receptors has been activated
within the sensory epithelium. The quality of an olfactory
stimulus would therefore be encoded by specific combinations
of glomeruli activated by a given odorant.

The logic of olfactory discrimination is quite different in
the nematode, C. elegans. Despite the large size of the
odorant receptor gene family, volatile odorants are
recognized by only three pairs of chemosensory cells, each
likely to express a large number of receptor genes (Bargmann
and Horvitz, 1991; Colbert and Bargmann, 1995; Troemel et
al., 1995). Activation of any one of the multiple receptors
in one cell will lead to chemoattraction, whereas activation
of receptors in a second cell will result in chemorepulsion
(Troemel et al., 1997). The specific neural circuit activated
by a given sensory neuron is therefore the determinant of the
behavioral response. Thus, this invertebrate olfactory
sensory system retains the ability to recognize a vast array
of odorants but has only limited discriminatory power.

Vertebrates create an internal representation of the external
olfactory world that must translate stimulus features into
neural information. Despite the elucidation of a precise
spatial map, it has been difficult in vertebrates to discern
how this information is decoded to relate the recognition of
odors to specific behavioral responses. Genetic analysis of
olfactory-driven behavior in invertebrates may ultimately
afford a system to understand the mechanistic link between
odor recognition and behavior. Insects provide an attractive
model system for studying the peripheral and central events
in olfaction because they exhibit sophisticated
olfactory-driven behaviors under control of an olfactory
sensory system that is significantly simpler anatomically
than that of vertebrates (Siddiqi, 1987; Carlson, 1996).
Olfactory-based associative learning, for example, is robust
in insects and results in discernible modifications in the
neural representation of odors in the brain (Faber et al.,
1998). It may therefore be possible to associate
modifications in defined olfactory connections with in vivo
paradigms for learning and memory.

Olfactory recognition in the fruit fly Drosophila is
accomplished by sensory hairs distributed over the surface
of the third antennal segment and the maxillary palp.
Olfactory neurons within sensory hairs send projections to
one of 43 glomeruli within the antennal lobe of the brain
(Stocker, 1994; Laissue et al., 1999). The glomeruli are
innervated by dendrites of the projection neurons, the insect
equivalent of the mitral cells in the vertebrate olfactory
bulb, whose cell bodies surround the glomeruli. These
antennal lobe neurons in turn project to the mushroom body
and lateral horn of the protocerebrum (reviewed in Stocker,
1994). 2-deoxyglucose mapping in the fruit fly (Rodrigues,
1988) and calcium imaging in the honeybee (Joerges et al.,
1997; Faber et al., 1998) demonstrate that different odorants
elicit defined patterns of glomerular activity, suggesting
that in insects as in vertebrates, a topographic map of odor
quality is represented in the antennal lobe. However, in the
absence of the genes encoding the receptor molecules, it has
not been possible to define a physical basis for this spatial
map.

In this study, we identify a large family of genes that are
likely to encode the odorant receptors of Drosophila
melanogaster. Difference cloning, along with analysis of
Drosophila genomic sequences, has led to the identification
of a novel family of putative seven transmembrane domain
receptors likely to be encoded by 100 to 200 genes within the
Drosophila genome. Each receptor is expressed in a small
subset of sensory cells (0.5-1.5%) that is spatially defined
within the antenna and maxillary palp. Moreover, different
neurons express distinct complements of receptor genes such
that individual neurons are functionally distinct.
Identification of a large family of putative odorant
receptors in insects indicates that, as in other species, the
diversity and specificity of odor recognition is accommodated
by a large family of receptor genes. The identification of
the family of putative odorant receptor genes may afford
insight into the logic of olfactory perception in Drosophila.

Insects provide an attractive system for the study of
olfactory sensory perception. We have identified a novel
family of seven transmembrane domain proteins, encoded by 100
to 200 genes, that is likely to represent the family of
Drosophila odorant receptors. Members of this gene family are
expressed in topographically defined subpopulations of
olfactory sensory neurons in either the antenna or the
maxillary palp. Sensory neurons express different complements
of receptor genes, such that individual neurons are
functionally distinct. The isolation of candidate odorant
receptor genes along with a genetic analysis of
olfactory-driven behavior in insects may ultimately afford
a system to understand the mechanistic link between odor
recognition and behavior.



SUMMARY OF THE INVENTION

This invention provides an isolated nucleic acid molecule
encoding an insect odorant receptor. In an embodiment, the
isolated nucleic acid molecule comprises: (a) one of the
nucleic acid sequences as set forth in Figure 8; (b) a
sequence that is degenerate with respect to a sequence of (a)
as a result of the genetic code; or (c) a sequence encoding
one of the amino acid sequences as set forth in Figure 8.

This invention provides a nucleic acid molecule of at least
12 nucleotides capable of specifically hybridizing with the
sequence of the above-described nucleic acid molecule. This
invention provides a vector which comprises the
above-described isolated nucleic acid molecule. In an
embodiment, the vector is a plasmid.

This invention also provides a host vector system for the
production of a polypeptide having the biological activity
of an insect odorant receptor which comprises the
above-described vector and a suitable host.

This invention provides a method of producing a polypeptide
having the biological activity of an insect odorant receptor
which comprises growing the above-described host vector
system under conditions permitting production of the
polypeptide and recovering the polypeptide so produced.

This invention also provides a purified insect odorant
receptor. This invention further provides a polypeptide
encoded by the above-described isolated nucleic acid
molecule.

This invention provides an antibody capable of specifically
binding to an insect odorant receptor. This invention also
provides an antibody capable of competitively inhibiting the
binding of the antibody capable of specifically binding to
an insect odorant receptor.

This invention provides a method for identifying cDNA inserts
encoding insect odorant receptors comprising: (a) generating
a cDNA library which contains clones carrying cDNA inserts
from antennal or maxillary palp sensory neurons; (b)
hybridizing nucleic acid molecules of the clones from the
cDNA library generated in step (a) with probes prepared
from the antenna or maxillary palp neurons and probes from
heads lacking antenna or maxillary palp neurons or from
virgin female body tissue; (c) selecting clones which
hybridized with probes from the antenna or maxillary palp
neurons but not with probes from heads lacking antenna or
maxillary palp neurons or from virgin female body tissue; and
(d) isolating clones which carry the hybridized inserts,
thereby identifying the inserts encoding odorant receptors.
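
The selection logic of step (c) can be illustrated with a
minimal Python sketch. The clone names and hybridization
calls below are hypothetical placeholders, not data from this
application; the sketch assumes each clone has already been
scored as positive or negative with each of the three probes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CloneSignal:
    """Hypothetical hybridization calls for one clone."""
    name: str
    antenna_palp: bool   # signal with antenna/maxillary palp probe
    head_minus: bool     # signal with head-minus-antenna/palp probe
    virgin_body: bool    # signal with virgin female body probe


def select_specific(clones):
    """Keep clones positive only with the antenna/palp probe."""
    return [
        c.name
        for c in clones
        if c.antenna_palp and not c.head_minus and not c.virgin_body
    ]


# Illustrative values only.
clones = [
    CloneSignal("clone_1", True, True, True),     # ubiquitous
    CloneSignal("clone_2", True, False, False),   # tissue specific
    CloneSignal("clone_3", False, False, False),  # no signal
    CloneSignal("clone_4", True, False, True),    # also in body
]
selected = select_specific(clones)
print(selected)
```

Only clones passing all three conditions proceed to step (d).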

This invention also provides cDNA inserts identified by the
above method.

This invention further provides a method for identifying DNA
inserts encoding insect odorant receptors comprising: (a)
generating DNA libraries which contain clones carrying
inserts from a sample which contains at least one antennal
or maxillary palp neuron; (b) contacting clones from the DNA
libraries generated in step (a) with a nucleic acid molecule
capable of specifically hybridizing with the sequence which
encodes an insect odorant receptor in appropriate conditions
permitting the hybridization of the nucleic acid molecules
of the clones and the nucleic acid molecule; (c) selecting
clones which hybridized with the nucleic acid molecule; and
(d) isolating the clones which carry the hybridized inserts,
thereby identifying the inserts encoding the odorant
receptors.

This invention also provides a method to identify DNA inserts
encoding insect odorant receptors comprising: (a) generating
DNA libraries which contain clones with inserts from a sample
which contains at least one antenna or maxillary palp sensory
neuron; (b) contacting the clones from the DNA libraries
generated in step (a) with appropriate polymerase chain
reaction primers capable of specifically binding to nucleic
acid molecules encoding odorant receptors in appropriate
conditions permitting the amplification of the hybridized
inserts by polymerase chain reaction; (c) selecting the
amplified inserts; and (d) isolating the amplified inserts,
thereby identifying the inserts encoding the odorant
receptors.

This invention also provides a method to isolate DNA
molecules encoding insect odorant receptors comprising: (a)
contacting a biological sample known to contain nucleic acids
with appropriate polymerase chain reaction primers capable
of specifically binding to nucleic acid molecules encoding
insect odorant receptors in appropriate conditions permitting
the amplification of the hybridized molecules by polymerase
chain reaction; and (b) isolating the amplified molecules,
thereby identifying the DNA molecules encoding the insect
odorant receptors.
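
The primer-directed amplification recited above can be
illustrated in silico. The following Python sketch is an
assumption-laden simplification: it treats primer binding as
exact string matching on one strand, and the template and
primer sequences are invented for illustration and are not
taken from Figure 8.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def in_silico_pcr(template, forward, reverse):
    """Return the product bounded by the two primers, or None.

    The forward primer must match the given strand; the reverse
    primer must match the opposite strand downstream of it.
    """
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# Hypothetical sequences for illustration only.
forward = "ATGCCATT"
reverse = "GCATGCCA"
insert = "GACCGTTAAAC"
template = "GGG" + forward + insert + reverse_complement(reverse) + "CC"

amplicon = in_silico_pcr(template, forward, reverse)
print(amplicon)
```

A template lacking either primer site yields no product, which
corresponds to the selection of amplified molecules in the
method above.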

This invention also provides a method of transforming cells
which comprises transfecting a host cell with a suitable
vector described above. This invention also provides
transformed cells produced by the above method.

This invention provides a method of identifying a compound
capable of specifically binding to an insect odorant receptor
which comprises contacting transfected cells or membrane
fractions of the above-described transfected cells with an
appropriate amount of the compound under conditions
permitting binding of the compound to such receptor,
detecting the presence of any such compound specifically
bound to the receptor, and thereby determining whether the
compound specifically binds to the receptor.

This invention provides a method of identifying a compound
capable of specifically binding to an insect odorant receptor
which comprises contacting an appropriate amount of the
purified insect odorant receptor with an appropriate amount
of the compound under conditions permitting binding of the
compound to such purified receptor, detecting the presence
of any such compound specifically bound to the receptor, and
thereby determining whether the compound specifically binds
to the receptor.

This invention also provides a method of identifying a
compound capable of activating the activity of an insect
odorant receptor which comprises contacting the transfected
cells or membrane fractions of the above-described
transfected cells with the compound under conditions
permitting the activation of a functional odorant receptor
response, the activation of the receptor indicating that the
compound is capable of activating the activity of an odorant
receptor.

This invention also provides a method of identifying a
compound capable of activating the activity of an odorant
receptor which comprises contacting a purified insect odorant
receptor with the compound under conditions permitting the
activation of a functional odorant receptor response, the
activation of the receptor indicating that the compound is
capable of activating the activity of an odorant receptor.
In an embodiment, the purified receptor is embedded in a
lipid bilayer.

This invention also provides a method of identifying a
compound capable of inhibiting the activity of an odorant
receptor which comprises contacting the transfected cells or
membrane fractions of the above-described transfected cells
with an appropriate amount of the compound under conditions
permitting the inhibition of a functional odorant receptor
response, the inhibition of the receptor response indicating
that the compound is capable of inhibiting the activity of
an odorant receptor.

This invention provides a method of identifying a compound
capable of inhibiting the activity of an odorant receptor
which comprises contacting an appropriate amount of the
purified insect odorant receptor with an appropriate amount
of the compound under conditions permitting the inhibition
of a functional odorant receptor response, the inhibition of
the receptor response indicating that the compound is capable
of inhibiting the activity of an odorant receptor. In an
embodiment, the purified receptor is embedded in a lipid
bilayer.

This invention also provides the compound identified by the
above-described methods.

This invention provides a method of controlling pest
populations which comprises identifying odorant ligands by
the above-described method which are alarm odorant ligands
and spraying the desired area with the identified odorant
ligands.

Finally, this invention provides a method of controlling a
pest population which comprises identifying odorant ligands
by the above-described method which interfere with the
interaction between the odorant ligands and the odorant
receptors which are associated with fertility.

BRIEF DESCRIPTION OF FIGURES

FIGURE 1 Identification of Rare Antennal- and Maxillary
Palp-Specific Genes

Candidate antennal/maxillary palp-specific phage
were subjected to in vivo excision, digestion of
resulting pBLUESCRIPT plasmid DNAs with
BamHI/Asp718, and electrophoresis on 1.5% agarose
gels. Southern blots were hybridized with
32P-labeled cDNA probes generated from
antennal/maxillary palp mRNA (Panel A), head minus
antennal/maxillary palp mRNA (Panel B), or virgin
female body mRNA (Panel C). The ethidium bromide
stained gel is shown in Panel D. Of the thirteen
clones displayed in this figure, four appear to be
antennal/maxillary palp specific (lanes 5, 7, 9,
and 11). However, only two are selectively
expressed in subsets of cells in chemosensory
organs of the adult fly. DOR104, a putative
maxillary palp odorant receptor, is in Lane 9. The
clone in Lane 11 (RN106) is homologous to
lipoprotein and triglyceride lipases and is
expressed in a restricted domain in the antenna
(data not shown).



FIGURE 2 Expression of DOR104 in a Subset of Maxillary Palp
Neurons

(A) A frontal section of an adult maxillary palp
was hybridized with a digoxigenin-labeled
antisense RNA probe and visualized with
anti-digoxigenin conjugated to alkaline
phosphatase. Seven cells expressing DOR104 are
visible in this 15 µm section, which represents
about one third of the diameter of the maxillary
palp. Serial sections of multiple maxillary palps
were scored for DOR104 expression and on average
20 cells per maxillary palp are positive for this
receptor.

(B) Transgenic flies carrying a DOR104-lacZ
reporter transgene were stained with X-GAL in a
whole mount preparation. Maxillary palps were
dissected from the head and viewed in a flattened
cover slipped preparation under Nomarski optics,
which allows the visualization of all 20 cells
expressing DOR104-lacZ.

(C) Dendrites and axons of neurons expressing
DOR104-lacZ are visible in this horizontal section
of a maxillary palp. LacZ expression was
visualized with a polyclonal anti-β-galactosidase
primary antibody and a CY3-conjugated secondary
antibody. Sections were viewed under
epifluorescence and photographed on black and
white film.

FIGURE 3 Predicted Amino Acid Sequences of Drosophila
Odorant Receptor Genes

Deduced amino acid sequences of 12 DOR genes are
aligned using ClustalW (MacVector, Oxford
Molecular). Predicted positions of transmembrane
regions (I-VII) are indicated by bars above the
alignment. Amino acid identities are marked with
dark shading and similarities are indicated with
light shading. Protein sequences of DOR87, 53, 67,
104, and 64 were derived from cDNA clones. All
others were derived from GENSCAN predictions of
intron-exon arrangements in genomic DNA, as
indicated by the letter "g" after the gene name. We
obtained a partial cDNA clone for DOR62 and found
it to be 100% identical to the GENSCAN protein in
the region of amino acids 245-381. A 40 amino acid
extension for DOR19 was predicted by GENSCAN
analysis. This has been replaced with an asterisk
in the alignment, and isolation of cDNA clones for
this receptor will resolve whether this extension
is physically present in the protein.



FIGURE 4 Receptor Gene Expression in Spatially Restricted
Regions of the Antenna

Digoxigenin-labeled antisense RNA probes against
8 DOR genes each hybridize to a small number of
cells distributed in distinct regions in the
antenna. The total number of cells per antenna
expressing a given receptor was obtained by
counting positive cells in serial sections of
multiple antennae. There are approximately 20
positive cells per antenna for DOR67 (A), 53 (B),
and 24 (data not shown); 15 positive cells for
DOR62 (C) and 87 (D); and 10 positive cells for
DOR64 (E). The actual number of cells staining in
these sections is a subset of this total number.
With the exception of DOR53 and DOR67, which
strongly cross-hybridize, the receptor genes
likely identify different olfactory neurons, such
that the number of cells staining with a mixed
probe (F) is equal to the sum of those staining
with the individual probes (A-E). The mixture of
DOR53, 67, 62, 87 and 64 labels a total of about
60 cells per antenna. A total of 34 cells stain
with the mixed probe in this 15 µm section.
Expression of the linked genes DOR71, DOR72, and
DOR73 is shown in panels (G), (H), and (I),
respectively. DOR71 is expressed in approximately
10 cells in the maxillary palp. Five positive
cells are seen in the horizontal section in panel
(G). We also examined the expression of the other
members of this linkage group and found DOR72 in
approximately 15 cells (of which 3 label in this
section) (H) and DOR73 in 1 to 2 cells per antenna
(I).
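
The mixed-probe count stated for panel (F) can be checked
with simple arithmetic. The Python sketch below uses the
approximate per-antenna counts given in this legend and
assumes, as one reading consistent with the stated total,
that the cross-hybridizing DOR53 and DOR67 probes label the
same cells and are therefore counted once. The function name
is hypothetical.

```python
# Approximate positive cells per antenna, from the legend above.
cells_per_antenna = {
    "DOR67": 20,
    "DOR53": 20,
    "DOR62": 15,
    "DOR87": 15,
    "DOR64": 10,
}

# Probes stated to cross-hybridize strongly.
cross_hybridizing = {"DOR53", "DOR67"}


def expected_mixed_count(counts, cross_group):
    """Sum the counts, counting a cross-hybridizing group once.

    Assumption: probes in cross_group label one shared cell
    population, represented by the largest count in the group.
    """
    independent = sum(
        n for gene, n in counts.items() if gene not in cross_group
    )
    shared = max(counts[gene] for gene in cross_group)
    return independent + shared


total = expected_mixed_count(cells_per_antenna, cross_hybridizing)
print(total)  # 15 + 15 + 10 + 20 = 60
```

Under this assumption the expected total of 60 agrees with
the approximately 60 cells reported for the mixed probe.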



FIGURE 5 Odorant Receptors are Restricted to Distinct
Populations of Olfactory Neurons

(A-C) Flies of the C155 elav-GAL4; UAS-lacZ
genotype express cytoplasmic lacZ in all neuronal
cells. Panels (A-C) show confocal images of a
horizontal maxillary palp section from such a fly
incubated with an antisense RNA probe against
DOR104 (red) and anti-β-galactosidase antibody
(green). DOR104 recognizes five cells in this
maxillary palp section (A), all of which also
express elav-lacZ (B), as demonstrated by the
yellow cells in the merged image in panel (C).

(D, E) DOR64 and DOR87 are expressed in
non-overlapping neurons at the tip of the antenna.
Antisense RNA probes for DOR64 (digoxigenin-RNA;
red) and DOR87 (FITC-RNA; green) were annealed to
the same antennal sections and viewed by confocal
microscopy. Panel (D) is a digital superimposition
of confocal images taken at 0.5 µm intervals
through a 10 µm section of the antenna. Cells at
different focal planes express both receptors, but
no double labeled cells are found.

(F, G) Two color RNA in situ hybridization with
odorant receptors and odorant binding proteins
demonstrates that these proteins are expressed in
different populations of cells. DOR53 (FITC-RNA;
green) labels a few cells internal to the cuticle
at the proximal-medial edge, while PBPRP2
(digoxigenin-RNA; red) labels a large number of
cells apposed to the cuticle throughout the
antenna (F). The more restricted odorant binding
protein OS-F (digoxigenin-RNA; red) also stains
cells distinct from those expressing DOR67
(FITC-RNA; green) (G).

FIGURE 6 Receptor Expression is Conserved Between
Individuals

Frontal sections of antennae from six different
individuals were hybridized with
digoxigenin-labeled antisense RNA probes against
DOR53 (A-C) or DOR87 (D-F). DOR53 labels
approximately 20 cells on the proximal-medial edge
of the antenna, of which approximately 5 are shown
labeling in these sections. DOR87 is expressed in
about the same number of cells at the distal tip.
Both the position and number of staining cells are
conserved between different individuals and are not
sexually dimorphic.

FIGURE 7 Drosophila Odorant Receptors are Highly Divergent

Oregon R genomic DNA isolated from whole flies was
digested with BamHI (B), EcoRI (E), or HindIII
(H), electrophoresed on 0.8% agarose gels, and
blotted to nitrocellulose membranes. Blots were
annealed with 32P-labeled probes derived from DOR53
cDNA (A), DOR67 cDNA (B), or DNA fragments
generated by RT-PCR from antennal mRNA for DOR24
(C), DOR62 (D), and DOR72 (E). Strong
crosshybridization of DOR53 and DOR67 is seen at
both high and low stringency (A, B), while DOR24,
62, and 72 reveal only a single hybridizing band
in each lane at both low stringency (C-E) and high
stringency (data not shown).

FIGURE 8 DOR 62, 104, 87, 53, 67, 64, 71g, 72g, 73g, 46,
19g, and 24g

The nucleic acid sequence of each DOR and its
encoded amino acid sequence are shown.

FIGURE 9 Analysis of axonal projections of olfactory
receptor neurons expressing a given Drosophila
odorant receptor. Result: all neurons expressing
a given receptor send their axons to a single
glomerulus, or discrete synaptic structure, in the
olfactory processing center of the fly brain.
This result is identical to that obtained with
mouse odorant receptors: each glomerulus is
dedicated to receiving axonal input from neurons
expressing a given odorant receptor. Therefore,
this result strengthens the argument that these
genes indeed function as odorant receptors in
Drosophila.



FIGURE 10 ClustalW alignments of two subfamilies of the
Drosophila odorant receptors, the DOR53 (A-1 and
A-2) and DOR64 (B) families. This figure
highlights sequence similarities between DOR
genes that are diagnostic hallmarks of the
proteins. Residues that are identical in
different DOR genes are highlighted in black,
while residues that are similar are highlighted in
gray.

DETAILED DESCRIPTION OF THE INVENTION



In order to facilitate an understanding of the Experimental
Procedures section which follows, certain frequently occurring
methods and/or terms are described in Sambrook, et al.
(1989).

Throughout this application, the following standard
abbreviations are used to indicate specific nucleotides:

C=cytosine    A=adenosine
T=thymidine   G=guanosine

This invention provides an isolated nucleic acid molecule
encoding an insect odorant receptor. The nucleic acid
includes but is not limited to DNA, cDNA, genomic DNA,
synthetic DNA or RNA. In an embodiment, the nucleic acid
molecule encodes a Drosophila odorant receptor.

In a further embodiment, the isolated nucleic acid molecule
comprises: (a) one of the nucleic acid sequences as set forth
in Figure 8; (b) a sequence that is degenerate with respect
to a sequence of (a) as a result of the genetic code; or (c)
a sequence encoding one of the amino acid sequences as set
forth in Figure 8.
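
The relationship between a sequence of (a) and a degenerate
sequence of (b) can be illustrated with a short Python
sketch. The two DNA sequences below are invented for
illustration and are not taken from Figure 8; the sketch
assumes the standard genetic code and an in-frame coding
sequence.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame DNA string; '*' marks a stop codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_degenerate_variant(seq_a, seq_b):
    """True if the sequences differ but encode the same protein."""
    return seq_a != seq_b and translate(seq_a) == translate(seq_b)


# Hypothetical sequences: both encode Met-Leu-Ser-Lys.
seq_a = "ATGCTGAGCAAA"
seq_b = "ATGTTATCTAAG"
print(translate(seq_a), translate(seq_b))
print(is_degenerate_variant(seq_a, seq_b))
```

Two such sequences differ at the nucleotide level yet encode
the same polypeptide, which is the sense in which (b) is
degenerate with respect to (a).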

The nucleic acid molecules encoding an insect receptor
include molecules coding for polypeptide analogs, fragments
or derivatives of antigenic polypeptides which differ from
naturally-occurring forms in terms of the identity or
location of one or more amino acid residues (deletion analogs
containing less than all of the residues specified for the
protein, substitution analogs wherein one or more residues
specified are replaced by other residues and addition analogs
wherein one or more amino acid residues are added to a
terminal or medial portion of the polypeptides) and which
share some or all properties of naturally-occurring forms.
These molecules include but are not limited to: the
incorporation of codons "preferred" for expression by
selected non-mammalian hosts; the provision of sites for
cleavage by restriction endonuclease enzymes; and the
provision of additional initial, terminal or intermediate
sequences that facilitate construction of readily expressed
vectors. Accordingly, these changes may result in a modified
insect odorant receptor. It is the intent of this invention
to include nucleic acid molecules which encode modified
insect odorant receptors. Also, to facilitate the expression
of the receptor in different host cells, it may be necessary
to modify the molecule such that the expressed receptors may
reach the surface of the host cells. The modified insect
odorant receptor should have biological activities similar
to the unmodified insect odorant receptor. The molecules may
also be modified to increase the biological activity of the
expressed receptor.

This invention provides a nucleic acid molecule of at least
12 nucleotides capable of specifically hybridizing with the
sequence of the above-described nucleic acid molecule. In
an embodiment, the nucleic acid molecule hybridizes with a
unique sequence within the sequence of the above-described
nucleic acid molecule. This nucleic acid molecule may be
DNA, cDNA, genomic DNA, synthetic DNA or RNA.
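
One way to think about a probe directed to a unique sequence
is sketched below in Python. This is a simplified
illustration under stated assumptions: hybridization is
approximated by exact complementarity, "unique" means the
12-nucleotide site occurs once in the target and on neither
strand of any background sequence, and all sequences shown
are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def unique_probes(target, background, k=12):
    """Return antisense k-mer probes directed to unique target sites.

    A site qualifies if it occurs exactly once in the target and
    neither it nor its reverse complement occurs in any
    background sequence.
    """
    probes = []
    for i in range(len(target) - k + 1):
        site = target[i:i + k]
        occurs_once = (
            target.find(site) == i and target.find(site, i + 1) == -1
        )
        if not occurs_once:
            continue
        antisense = reverse_complement(site)
        if any(site in b or antisense in b for b in background):
            continue
        probes.append(antisense)
    return probes


# Hypothetical target and one background sequence sharing its
# first 12 nucleotides.
target = "ACGTTGCAAGCTTGGCCATG"
background = ["GGGG" + target[:12] + "GGGG"]
probes = unique_probes(target, background)
print(len(probes), probes[0])
```

In practice hybridization specificity also depends on
stringency conditions, which this exact-match sketch does not
model.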

This invention provides a vector which comprises the
above-described isolated nucleic acid molecule. In an
embodiment, the vector is a plasmid.

In an embodiment, the above-described isolated nucleic acid
molecule is operatively linked to a regulatory element.

Regulatory elements required for expression include promoter
sequences to bind RNA polymerase and translation initiation
sequences for ribosome binding. For example, a bacterial
expression vector includes a promoter such as the lac
promoter and for translation initiation the Shine-Dalgarno
sequence and the start codon AUG. Similarly, a eukaryotic
expression vector includes a heterologous or homologous
promoter for RNA polymerase II, a downstream polyadenylation
signal, the start codon AUG, and a termination codon for
detachment of the ribosome. Such vectors may be obtained
commercially or assembled from the sequences described by
methods well-known in the art, for example the methods
described above for constructing vectors in general.
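
As an illustration of the bacterial initiation elements just
described, the following Python sketch checks a DNA sequence
for a Shine-Dalgarno motif followed at a short distance by a
start codon. The consensus motif AGGAGG and the 5 to 10
nucleotide spacing are assumptions chosen for the example,
and the test sequences are hypothetical.

```python
import re


def has_ribosome_binding_context(dna, sd_motif="AGGAGG",
                                 min_gap=5, max_gap=10):
    """True if sd_motif is followed by ATG within the given spacing.

    The sequence is given as the sense-strand DNA, so the start
    codon AUG appears as ATG.
    """
    pattern = re.compile(
        f"{sd_motif}[ACGT]{{{min_gap},{max_gap}}}ATG"
    )
    return pattern.search(dna) is not None


# Hypothetical sequences for illustration only.
with_spacing = "TTAGGAGGACAGCTATGAAA"   # 6 nt between motif and ATG
without_spacing = "TTAGGAGGATGAAA"      # ATG directly after motif
print(has_ribosome_binding_context(with_spacing))
print(has_ribosome_binding_context(without_spacing))
```

Actual vectors vary in motif sequence and spacing, so a check
of this kind is only a rough screen.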

This invention also provides a host vector system for the
production of a polypeptide having the biological activity
of an insect odorant receptor which comprises the
above-described vector and a suitable host.

This invention also provides a host vector system, wherein
the suitable host is a bacterial cell, yeast cell, insect
cell, or animal cell. The host cell of the above expression
system may be selected from the group consisting of the cells
where the protein of interest is normally expressed, or
foreign cells such as bacterial cells (such as E. coli),
yeast cells, fungal cells, insect cells, nematode cells,
plant or animal cells, where the protein of interest is not
normally expressed. Suitable animal cells include, but are
not limited to, Vero cells, HeLa cells, Cos cells, CV1 cells
and various primary mammalian cells.



This invention provides a method of producing a polypeptide
having the biological activity of an insect odorant receptor
which comprises growing the above-described host vector
system under conditions permitting production of the
polypeptide and recovering the polypeptide so produced.

This invention also provides a purified insect odorant
receptor. This invention further provides a polypeptide
encoded by the above-described isolated nucleic acid
molecule.

This invention provides an antibody capable of specifically
binding to an insect odorant receptor. This invention also
provides an antibody capable of competitively inhibiting the
binding of the antibody capable of specifically binding to
an insect odorant receptor. In an embodiment, the antibody
is monoclonal. In another embodiment, the antibody is
polyclonal.

Monoclonal antibody directed to an insect odorant receptor
may comprise, for example, a monoclonal antibody directed to
an epitope of an insect odorant receptor present on the
surface of a cell. Amino acid sequences may be analyzed by
methods well known to those skilled in the art to determine
whether they produce hydrophobic or hydrophilic regions in
the proteins which they build. In the case of cell membrane
proteins, hydrophobic regions are well known to form the part
of the protein that is inserted into the lipid bilayer which
forms the cell membrane, while hydrophilic regions are
located on the cell surface, in an aqueous environment.
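
One widely used way to carry out the hydrophobic/hydrophilic analysis mentioned above is a sliding-window hydropathy profile. The sketch below uses the Kyte and Doolittle (1982) hydropathy scale; the 19-residue window and the 1.6 cutoff are conventional choices assumed here for illustration, and the function names are hypothetical rather than taken from this specification.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 19) -> list:
    """Mean hydropathy of every full window along the sequence."""
    scores = [KD[residue] for residue in seq.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def hydrophobic_segments(seq: str, window: int = 19,
                         threshold: float = 1.6) -> list:
    """Merge windows scoring at or above threshold into (start, end) spans.

    Spans use 0-based start and exclusive end indices.
    """
    segments = []
    for i, score in enumerate(hydropathy_profile(seq, window)):
        if score < threshold:
            continue
        start, end = i, i + window
        if segments and start <= segments[-1][1]:
            segments[-1] = (segments[-1][0], end)  # extend previous span
        else:
            segments.append((start, end))
    return segments
```

Regions outside the returned spans are the more hydrophilic stretches that would be candidate surface epitopes, subject to experimental confirmation.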

Antibodies directed to an insect odorant receptor may be
serum-derived or monoclonal and are prepared using methods
well known in the art. For example, monoclonal antibodies
are prepared using hybridoma technology by fusing
antibody-producing B cells from immunized animals with
myeloma cells and selecting the resulting hybridoma cell line
producing the desired antibody. Cells such as NIH3T3 cells or
293 cells which express the receptor may be used as
immunogens to raise such an antibody. Alternatively,
synthetic peptides may be prepared using commercially
available machines.

As a still further alternative, DNA, such as a cDNA or a
fragment thereof, encoding the receptor or a portion of the
receptor may be cloned and expressed. The expressed
polypeptide may then be recovered and used as an immunogen.

The resulting antibodies are useful to detect the presence
of insect odorant receptors or to inhibit the function of the
receptor in living animals, in humans, or in biological
tissues or fluids isolated from animals or humans.



These antibodies may also be useful for identifying or
isolating other insect odorant receptors. For example,
antibodies against the Drosophila odorant receptor may be
used to screen a cockroach expression library for a
cockroach odorant receptor. Such antibodies may be
monoclonal or monospecific polyclonal antibodies against a
selected insect odorant receptor. Different insect
expression libraries are readily available and may be made
using technologies well-known in the art.

One means of isolating a nucleic acid molecule which encodes
an insect odorant receptor is to probe a library with
natural or artificially designed probes, using methods well
known in the art. The probes may be DNA or RNA. The library
may be cDNA or genomic DNA.

This invention provides a method for identifying cDNA inserts
encoding insect odorant receptors comprising: (a) generating
a cDNA library which contains clones carrying cDNA inserts
from antennal or maxillary palp sensory neurons; (b)
hybridizing nucleic acid molecules of the clones from the
cDNA library generated in step (a) with probes prepared
from the antenna or maxillary palp neurons and probes from
heads lacking antenna or maxillary palp neurons or from
virgin female body tissue; (c) selecting clones which
hybridized with probes from the antenna or maxillary palp
neurons but not from heads lacking antenna or maxillary palp
neurons or virgin female body tissue; and (d) isolating
clones which carry the hybridized inserts, thereby
identifying the inserts encoding odorant receptors.
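
The selection in step (c) amounts to keeping clones that are positive with the tissue-specific probe and negative with every control probe. A minimal sketch of that logic follows; the probe labels, signal values, and threshold are hypothetical names and numbers chosen only for this example.

```python
def select_tissue_specific(signals: dict,
                           specific: str = "antenna_palp",
                           controls: tuple = ("head_minus", "female_body"),
                           threshold: float = 1.0) -> list:
    """Return clone names positive for `specific` and negative for controls.

    `signals` maps clone name -> {probe label: hybridization signal}.
    A signal counts as positive when it is >= threshold (assumed cutoff).
    """
    selected = []
    for clone, by_probe in signals.items():
        positive = by_probe[specific] >= threshold
        clean = all(by_probe[c] < threshold for c in controls)
        if positive and clean:
            selected.append(clone)
    return sorted(selected)


# Hypothetical blot readings for four clones.
example = {
    "clone_a": {"antenna_palp": 5.0, "head_minus": 0.1, "female_body": 0.0},
    "clone_b": {"antenna_palp": 4.0, "head_minus": 3.5, "female_body": 0.2},
    "clone_c": {"antenna_palp": 0.2, "head_minus": 0.1, "female_body": 0.1},
    "clone_d": {"antenna_palp": 1.0, "head_minus": 0.9, "female_body": 0.9},
}
print(select_tissue_specific(example))
```

Here clone_b is rejected because it also hybridizes with a control probe, and clone_c is rejected because it is negative with the tissue-specific probe.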

In an embodiment, the above method further comprises, after
step (c): (a) amplifying the inserts from the
selected clones by polymerase chain reaction; (b) hybridizing
the amplified inserts with probes from the antennal or
maxillary palp neurons; and (c) isolating the clones which
carry the hybridized inserts, thereby identifying the inserts
encoding the odorant receptors.

In an embodiment, the probes are cDNA probes.

The appropriate polymerase chain reaction primers may be
chosen from the conserved regions of the known insect odorant
receptor sequences. Alternatively, the primers may be chosen
from the regions which are the active sites for the binding
of ligands.

This invention also provides cDNA inserts identified by the
above method.

This invention further provides a method for identifying DNA
inserts encoding insect odorant receptors comprising: (a)
generating DNA libraries which contain clones carrying
inserts from a sample which contains at least one antennal
or maxillary palp neuron; (b) contacting clones from the DNA
libraries generated in step (a) with a nucleic acid molecule
capable of specifically hybridizing with the sequence which
encodes an insect odorant receptor in appropriate conditions
permitting the hybridization of the nucleic acid molecules
of the clones and the nucleic acid molecule; (c) selecting
clones which hybridized with the nucleic acid molecule; and
(d) isolating the clones which carry the hybridized inserts,
thereby identifying the inserts encoding the odorant
receptors.

This invention also provides a method to identify DNA inserts
encoding insect odorant receptors comprising:
(a) generating DNA libraries which contain clones with
inserts from a sample which contains at least one antennal or
maxillary palp sensory neuron; (b) contacting the clones from
the DNA libraries generated in step (a) with appropriate
polymerase chain reaction primers capable of specifically
binding to nucleic acid molecules encoding odorant receptors
in appropriate conditions permitting the amplification of the
hybridized inserts by polymerase chain reaction; (c)
selecting the amplified inserts; and (d) isolating the
amplified inserts, thereby identifying the inserts encoding
the odorant receptors.

This invention also provides a method to isolate DNA
molecules encoding insect odorant receptors comprising: (a)
contacting a biological sample known to contain nucleic acids
with appropriate polymerase chain reaction primers capable
of specifically binding to nucleic acid molecules encoding
insect odorant receptors in appropriate conditions permitting
the amplification of the hybridized molecules by polymerase
chain reaction; and (b) isolating the amplified molecules,
thereby identifying the DNA molecules encoding the insect
odorant receptors.

This invention also provides a method of transforming cells
which comprises transfecting a host cell with a suitable
vector described above.

This invention also provides transformed cells produced by
the above method. In an embodiment, the host cells do not
normally express odorant receptors. In another embodiment,
the host cells express odorant receptors.

This invention provides a method of identifying a compound
capable of specifically binding to an insect odorant receptor
which comprises contacting transfected cells or membrane
fractions of the above-described transfected cells with an
appropriate amount of the compound under conditions
permitting binding of the compound to such receptor,
detecting the presence of any such compound specifically
bound to the receptor, and thereby determining whether the
compound specifically binds to the receptor.

This invention provides a method of identifying a compound
capable of specifically binding to an insect odorant receptor
which comprises contacting an appropriate amount of the
purified insect odorant receptor with an appropriate amount
of the compound under conditions permitting binding of the
compound to such purified receptor, detecting the presence
of any such compound specifically bound to the receptor, and
thereby determining whether the compound specifically binds
to the receptor. In an embodiment, the purified receptor is
embedded in a lipid bilayer. The purified receptor may be
embedded in liposomes with proper orientation to carry
out normal functions. Liposome technology is well-known in
the art.

This invention also provides a method of identifying a
compound capable of activating the activity of an insect
odorant receptor which comprises contacting the transfected
cells or membrane fractions of the above-described
transfected cells with the compound under conditions
permitting the activation of a functional odorant receptor
response, the activation of the receptor indicating that the
compound is capable of activating the activity of an odorant
receptor.

This invention also provides a method of identifying a
compound capable of activating the activity of an odorant
receptor which comprises contacting a purified insect odorant
receptor with the compound under conditions permitting the
activation of a functional odorant receptor response, the
activation of the receptor indicating that the compound is
capable of activating the activity of an odorant receptor.
In an embodiment, the purified receptor is embedded in a
lipid bilayer.

This invention also provides a method of identifying a
compound capable of inhibiting the activity of an odorant
receptor which comprises contacting the transfected cells or
membrane fractions of the above-described transfected cells
with an appropriate amount of the compound under conditions
permitting the inhibition of a functional odorant receptor
response, the inhibition of the receptor response indicating
that the compound is capable of inhibiting the activity of
an odorant receptor.

This invention provides a method of identifying a compound
capable of inhibiting the activity of an odorant receptor
which comprises contacting an appropriate amount of the
purified insect odorant receptor with an appropriate amount
of the compound under conditions permitting the inhibition
of a functional odorant receptor response, the inhibition of
the receptor response indicating that the compound is capable
of inhibiting the activity of an odorant receptor. In an
embodiment, the purified receptor is embedded in a lipid
bilayer.

In a separate embodiment of the above method, the compound
is not previously known. This invention also provides the
compound identified by the above-described methods.

This invention provides a method of controlling pest
populations which comprises identifying odorant ligands by
the above-described method which are alarm odorant ligands
and spraying the desired area with the identified odorant
ligands.

Finally, this invention provides a method of controlling a
pest population which comprises identifying odorant ligands
by the above-described method which interfere with the
interaction between the odorant ligands and the odorant
receptors which are associated with fertility.

This invention will be better understood from the
Experimental Procedures which follow. However, one skilled
in the art will readily appreciate that the specific methods
and results discussed are merely illustrative of the
invention as described more fully in the claims which follow
thereafter.



Experimental Animals

Oregon R flies (Drosophila melanogaster) were raised on
standard cornmeal-agar-molasses medium at 25°C. Transgenic
constructs were injected into yw embryos. C155 elav-GAL4
flies were obtained from Corey Goodman (Lin and Goodman,
1994) and Gary Struhl provided the UAS-(cytoplasmic) lacZ
stock.

Antennal/Maxillary Palp cDNA Library

Drosophila antennae and maxillary palps were obtained by
manually decapitating and freezing 5000 adult flies and
shaking antennae and maxillary palps through a fine metal
sieve. mRNA was prepared using a polyA+ RNA Purification Kit
(Stratagene). An antennal/maxillary palp cDNA library was
made from 0.5 µg of mRNA using the LambdaZAPIIXR kit from
Stratagene.

Briefly, phage were plated at low density (500-1000 pfu/150 mm
plate) and UV-crosslinked after lifting in triplicate to
Hybond-N+ (Amersham). Complex probes were generated by random
primed labeling (PrimeIt II, Stratagene) of reverse
transcribed mRNA (RT-PCR kit, Stratagene) from virgin adult
female body mRNA and duplicate lifts hybridized at high
stringency for 36 hours (65°C in 0.5 M sodium phosphate buffer
[pH 7.3] containing 1% bovine serum albumin, 4% SDS, and 0.5
mg/ml herring sperm DNA). We prescreened the third lift with
a mix of all previously cloned OBPs/PBPs (McKenna et al.,
1994; Pikielny et al., 1994; Kim et al., 1998) to remove a
source of abundant but undesired olfactory-specific clones.
Approximately 5000 individual OBP/PBP and virgin female body
negative phage clones were isolated, their inserts amplified
by PCR with T3 and T7 primers, and approximately 3 µg of DNA
were electrophoresed on 1.5% agarose gels. Gels were blotted
in duplicate to Hybond-N+ (Amersham), filters were
UV-crosslinked, and the resulting Southern blots were
subjected to reverse Northern analysis using complex probes
generated from virgin female body mRNA. Approximately 500
clones not hybridizing with virgin female body probes were
identified and consolidated onto secondary Southern blots in
triplicate. These blots were probed with complex probes
derived from antennal/maxillary palp,
head-minus-antenna/maxillary palp, and virgin female body
mRNA. A total of 210 clones negative with
head-minus-antenna/maxillary palp and virgin female body
probes and strongly positive, weakly positive, or negative
with antennal/maxillary palp probes were further analyzed by
sequencing and in situ hybridization.

Analysis of Drosophila Genome Project Sequences for
Transmembrane Proteins

All Drosophila genomic sequences were batch downloaded in
April 1998 from the Berkeley Drosophila Genome Project
(Berkeley Drosophila Genome Project, unpublished). Genomic
P1 sequences were first analyzed with the GENSCAN program
(Burge and Karlin, 1997;
http://CCR-081.mit.edu/GENSCAN.html), which predicts
intron-exon structures and generates hypothetical coding
sequences (CDS) and open reading frames. GENSCAN-predicted
proteins shorter than 50 amino acids were discarded. The
remaining open reading frames were used to search for
putative transmembrane regions greater than 15 amino acids
with two programs that were obtained from the authors and
used in stand-alone mode locally (see Persson and Argos,
1994; Cserzo et al., 1997). The Dense Alignment Surface (DAS)
program is available at http://www.biokemi.su.se/~server/DAS/
or from M. Cserzo (miklos@pugh.bip.bham.ac.uk). TMAP is
available at ftp://ftp.ebi.ac.uk/pub/software/unix/, or by
contacting the author, Bengt Persson (bpn@mbb.ki.se). Scripts
were written to apply the DAS and TMAP programs repeatedly
to genome-scale sequence sets. Genes showing significant
sequence similarity to the NCBI non-redundant protein
database using BLAST analysis (Altschul et al., 1990;
Altschul et al., 1997) were eliminated. All scripts required
for these computations were written in standard ANSI C and
run on a SUN Enterprise 3000.
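
The filtering steps described above can be sketched as a small pipeline. This is not the ANSI C code referred to in the text, and it does not call GENSCAN, DAS, TMAP, or BLAST. The predictor below is a deliberately crude stand-in (unbroken runs of hydrophobic residues), the function names are hypothetical, and the requirement of at least three predicted transmembrane regions is assumed from the candidate criterion used in this specification.

```python
import re


def crude_tm_segments(protein: str, min_len: int = 16) -> list:
    """Stand-in predictor: unbroken hydrophobic runs of more than 15 residues.

    Returns (start, end) spans, 0-based with exclusive end. DAS and TMAP
    use far more sophisticated scoring than this placeholder.
    """
    pattern = re.compile(r"[AILMFVWCG]{%d,}" % min_len)
    return [match.span() for match in pattern.finditer(protein.upper())]


def candidate_membrane_proteins(predicted: dict,
                                predictor=crude_tm_segments,
                                min_protein_len: int = 50,
                                min_tm: int = 3) -> dict:
    """Keep predicted proteins long enough and with enough TM segments.

    `predicted` maps a gene name to its predicted protein sequence.
    The BLAST-based removal of known proteins is not modeled here.
    """
    kept = {}
    for name, protein in predicted.items():
        if len(protein) < min_protein_len:
            continue  # discard predictions shorter than 50 amino acids
        segments = predictor(protein)
        if len(segments) >= min_tm:
            kept[name] = segments
    return kept
```

Because the predictor is passed in as a parameter, the same loop could wrap any stand-alone transmembrane prediction program whose output is converted to spans.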

Of 229 novel Drosophila proteins with three or more predicted
transmembrane spanning regions, 35 showed no clear sequence
similarity to any known protein and were selected for further
analysis by in situ hybridization. Probes for in situ
hybridization were generated by RT-PCR using
antennal/maxillary palp mRNA as a template.



Map Positions of DOR Genes

The chromosome position of DOR104 was determined by in situ
hybridization of a biotin-labeled probe to salivary gland
polytene chromosome squashes as described (Amrein et al.,
1988).

Chromosomal positions of all other DOR genes were based on
chromosome assignments of the P1 clones to which they map,
as determined by the Berkeley Drosophila Genome Project
(personal communication; http://www.fruitfly.org; see also
Hartl et al., 1994; Kimmerly et al., 1996). DOR62 maps to a
cosmid sequenced by the European Drosophila Genome Project
(unpublished; http://edgp.ebi.ac.uk/; Siden-Kiamos et al.,
1990).





RECEPTOR    MAP POSITION        P1 CLONE ACCESSION NUMBER

DOR62       (X)   2F            G2D9 (EDGP cosmid)
DOR67       (2L)  22A3          DS00676
DOR53       (2L)  22A2-3        DS05342
DOR64       (2L)  23A1-2        DS06400
DOR71       (2L)  33B1-2        DS07071
DOR72       (2L)  33B1-2        DS07071
DOR73       (2L)  33B1-2        DS07071
DOR87       (2R)  43B1-2        DS08779
DOR19       (2R)  46F5-6        DS01913
DOR24       (2R)  47D6-E2       DS00724
DOR46       (2R)  59D5-7        DS07462
DOR104      (3R)  85B           not applicable

The Isolation of DOR cDNA Clones and Southern Blotting

We screened 3 x 10^6 clones of the antennal/maxillary palp
library described above with PCR probes for the genes DOR87,
DOR53, DOR67, DOR64, and DOR62. cDNAs were present at a
frequency ranging from 1:200,000 (DOR67) to 1:1,000,000
(DOR62) in the library and their sequences were remarkably
similar to the hypothetical CDS predicted by the GENSCAN
program. The frequency of these genes is similar to that of
DOR104, which is present at 1:125,000 in the
antennal/maxillary palp library. All sequencing was with ABI
cycle sequencing kits and reactions were run on an ABI 310
or 377 sequencing system.

Five µg of Oregon R genomic DNA isolated from whole flies
were digested with BamHI, EcoRI, or HindIII, electrophoresed
on 0.8% agarose gels, and blotted to Nitropure nitrocellulose
membranes (Micron Separations Inc.). Blots were baked and
annealed with 32P-labeled probes derived from cDNA probes of
DOR53 and DOR67, or PCR fragments from DOR24, DOR62, and
DOR72. Hybridization was at 42°C for 36 hours in 5X SSCP, 10X
Denhardts, 500 µg/ml herring sperm DNA, and either 50% (high
stringency) or 25% (low stringency) formamide (Sambrook et
al., 1989). Blots were washed for 1 hour in 0.2X SSC, 0.5%
SDS at 65°C (high stringency) or 1X SSC, 0.5% SDS at 42°C (low
stringency).
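
To give a rough sense of why the formamide concentration sets the stringency at a fixed 42°C, the sketch below uses a commonly cited empirical approximation for the melting temperature of long DNA:DNA hybrids. The coefficients are an assumption of this example (published variants differ slightly, for instance 0.61 to 0.65 per percent formamide and 500 to 600 for the length term), and the sodium concentration, GC content, and probe length are hypothetical inputs, not values from this specification.

```python
import math


def estimate_tm(na_molar: float, gc_percent: float,
                formamide_percent: float, length_bp: int) -> float:
    """Approximate melting temperature (deg C) of a long DNA:DNA hybrid.

    Empirical rule of thumb; coefficients vary between references.
    """
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.63 * formamide_percent
            - 600.0 / length_bp)


# Hypothetical probe: 500 bp, 45% GC, 0.5 M monovalent cation.
tm_high_stringency = estimate_tm(0.5, 45.0, 50.0, 500)  # 50% formamide
tm_low_stringency = estimate_tm(0.5, 45.0, 25.0, 500)   # 25% formamide
print(round(tm_low_stringency - tm_high_stringency, 2))
```

Under this approximation, lowering formamide from 50% to 25% raises the estimated melting temperature by about 16°C, so hybridization at the same 42°C becomes correspondingly more permissive of mismatched duplexes.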

In situ Hybridization

RNA in situ hybridization was carried out essentially as
described (Schaeren-Wiemers and Gerfin-Moser, 1993). This
protocol was modified to include detergents in most steps to
increase sensitivity and reduce background. The hybridization
buffer contained 50% formamide, 5X SSC, 5X Denhardts, 250
µg/ml yeast tRNA, 500 µg/ml herring sperm DNA, 50 µg/ml
Heparin, 2.5 mM EDTA, 0.1% Tween-20, 0.25% CHAPS. All
antibody steps were in the presence of 0.1% Triton X-100, and
the reaction was developed in buffer containing 0.1%
Tween-20. Slides were mounted in Glycergel (DAKO) and viewed
with Nomarski optics.

Fluorescent in situ hybridization was carried out as above
with either digoxigenin or FITC labeled RNA probes. The
digoxigenin probe was visualized with sheep anti-digoxigenin
(Boehringer) followed by donkey anti-sheep CY3 (Jackson).
FITC probes were visualized with mouse anti-FITC (Boehringer)
and goat anti-mouse Alexa 488 (Molecular Probes) following
preincubation with normal goat serum. Sections were mounted
in Vectashield reagent (Vector Labs) and viewed on a Biorad
1024 Confocal Microscope.

For double labeling with a neural marker, animals of the
genotype C155 elav-Gal4; UAS-lacZ were sectioned and first
hybridized with a digoxigenin labeled antisense DOR104 RNA
probe and developed as described above. Neuron-specific
expression of lacZ driven by the elav-Gal4 enhancer trap was
visualized with a polyclonal rabbit anti-β-galactosidase
antibody (Organon-Technika/Cappel), detected with a goat
anti-rabbit Alexa488 conjugated secondary antibody (Molecular
Probes) following preincubation with normal goat serum.

The proportion of neurons in the third antennal segment was
calculated by comparing the number of nuclei staining with
the 44C11 ELAV monoclonal (kindly provided by Lily Jan) and
those staining with TOTO-3 (Molecular Probes), a nucleic acid
counterstain, in several confocal sections of multiple
antennae. On average, 36% of the nuclei in the antenna were
ELAV positive.

DOR104-lacZ Transgene Construction and Histochemical Staining

A genomic clone containing the DOR104 coding region and
several kb of upstream sequence was isolated from a genomic
library prepared from flies isogenic for the third chromosome
(a gift of Kevin Moses and Gerry Rubin). Approximately 3 kb
of DNA immediately upstream of the putative translation start
site of DOR104 were isolated by PCR and subcloned into the
pCasperAUGβGal vector (Thummel et al., 1988). β-galactosidase
activity staining was carried out with whole mount head
preparations essentially as described in Wang et al. (1998).
Frozen sections of DOR104-lacZ maxillary palps were incubated
with a polyclonal rabbit anti-β-galactosidase antibody
as described above.
EXPERIMENTAL RESULTS

Cloning Candidate Odorant Receptors

In initial experiments, we isolated a cDNA encoding a
putative odorant receptor by a difference cloning strategy
designed to detect cDNA copies of mRNA present at extremely
low frequencies in an mRNA population. In the antenna and
maxillary palp, about 30% of the cells are olfactory neurons.
If each neuron expressed only one of a possible 100 different
odorant receptor genes at a level of 0.1% of the mRNA in a
sensory neuron, then a given receptor mRNA would be
encountered at a frequency of one in 300,000 in antennal
mRNA. If 100 different receptor genes were expressed, then
the entire family of receptor genes would be represented at
a frequency of one in 3,000 mRNAs. We therefore introduced
experimental modifications into standard difference cloning
to allow for the identification of extremely rare mRNAs whose
expression is restricted to either the antenna or the
maxillary palp.
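
The frequency estimate above is a product of three fractions, which can be checked with a short calculation. The variable names are introduced here for illustration only, and the calculation makes the same simplifying assumption as the text, namely that every cell contributes equally to the mRNA pool.

```python
from fractions import Fraction

neuron_fraction = Fraction(30, 100)      # about 30% of cells are neurons
receptor_genes = 100                     # assumed size of the gene family
share_of_neuron_mrna = Fraction(1, 1000) # receptor is 0.1% of neuronal mRNA

# Frequency of one given receptor mRNA in total antennal mRNA.
single_receptor = (neuron_fraction
                   * Fraction(1, receptor_genes)
                   * share_of_neuron_mrna)

# Frequency of the whole family, summing over all receptor genes.
whole_family = single_receptor * receptor_genes

print(f"one receptor: 1 in {round(1 / single_receptor):,}")
print(f"whole family: 1 in {round(1 / whole_family):,}")
```

The exact products are 3 in 1,000,000 and 3 in 10,000, which the text rounds to one in 300,000 and one in 3,000.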

Briefly, 5000 inserts from an antennal/maxillary palp cDNA
library were prescreened (see Experimental Procedures) and
then subjected to Southern blot hybridization with cDNA
probes from antennal/maxillary palp, head minus
antenna/maxillary palp, or virgin female body mRNA (see
Figure 1). This Southern blot hybridization (or reverse
Northern) to candidate cDNAs allows for the detection of
sequences present at a frequency of 1 in 100,000 in the
probe, a sensitivity about one hundredfold greater than that
of plaque screening (see Experimental Procedures). This
procedure led to the identification of multiple
antennal/maxillary palp-specific cDNAs that were analyzed by
DNA sequencing and in situ hybridization. One cDNA, DOR104
(for Drosophila Odorant Receptor) (Figure 1, Lane 9), encodes
a putative seven-transmembrane domain protein with no obvious
sequence similarity to known serpentine receptors (Figure 3).
In situ hybridization revealed that this cDNA anneals to
about 15% of the 120 sensory neurons within the maxillary
palp but does not anneal with neurons in either the brain or
antenna. Seven cells expressing DOR104 are shown in the
frontal maxillary palp section in Figure 2A.

These observations suggested that DOR104 might be one member
of a larger family of odorant receptor genes within the
Drosophila genome. However, we were unable to identify
additional genes homologous to DOR104 by low stringency
hybridization to genomic DNA and cDNA libraries or upon
analysis of linked genes in a genomic walk. We therefore
analyzed the Drosophila genome database for families of
multiple transmembrane domain proteins that share sequence
similarity with DOR104. Sequences representing about 10% of
the Drosophila genome were downloaded (Berkeley Drosophila
Genome Project) and subjected to GENSCAN analysis (Burge and
Karlin, 1997) to predict the intron-exon structure of all
sequences within the database. Open reading frames greater
than 50 amino acids were searched for proteins with three or
more predicted transmembrane-spanning regions using the dense
alignment surface (DAS) and TMAP algorithms (Persson and
Argos, 1994; Cserzo et al., 1997; also see Experimental
Procedures). Of 229 candidate genes identified in this
manner, 11 encoded proteins that define a novel divergent
family of presumed seven transmembrane domain proteins with
sequence similarity to the DOR104 sequence. This family of
candidate odorant receptors does not share any conserved
sequence motifs with previously identified families of seven
transmembrane domain receptors. cDNA clones containing the
coding regions for 5 of the 11 genes identified by GENSCAN
analysis have been isolated from an antennal/maxillary palp
cDNA library and their sequences are provided in Figure 3.
The remaining 6 protein sequences derive from GENSCAN
predictions for intron-exon arrangement. Their organization
conforms well to the actual structure determined from the
cDNA sequences of other members of the gene family (Figure
3).

The receptors consist of a short extracellular N-terminal
domain (usually less than 50 amino acids) and seven presumed
membrane-spanning domains. Analysis of presumed transmembrane
domains (Kyte and Doolittle, 1982; Persson and Argos, 1994;
Cserzo et al., 1997) reveals multiple hydrophobic segments,
but it is not possible from this analysis to unequivocally
determine either the number or placement of the membrane
spanning domains. At present, our assignment of transmembrane
domains is therefore tentative.

The individual family members are divergent and most exhibit
from 17-26% amino acid identity. Two linked clusters of
receptor genes constitute small subfamilies of genes with
significantly greater sequence conservation. Two linked
genes, DOR53 and DOR67, exhibit 76% amino acid identity,
whereas the three linked genes, DOR71, 72 and 73, reveal
30-55% identity (Figure 3; see below). Despite the
divergence, each of the genes shares short, common motifs in
fixed positions within the putative seven transmembrane
domain structure that define these sequences as highly
divergent members of a novel family of putative receptor
molecules.
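
Percent amino acid identity figures such as those above are computed from a pairwise alignment. A minimal sketch follows; the convention used here (identical residues divided by aligned, non-gap columns) is one of several in use and is an assumption of this example, since the text does not state which was applied. The toy sequences are hypothetical and are not DOR sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical aligned fragments: 4 identical residues in 6 compared columns.
print(round(percent_identity("MKT-LLA", "MRTALLV"), 1))
```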

Expression of the DOR Gene Family in Olfactory Neurons

If this gene family encodes putative odorant receptors in the
fly, we might expect that other members of the family in
addition to DOR104 would also be expressed in olfactory
sensory neurons. We therefore performed in situ hybridization
to examine the pattern of receptor expression of each of the
11 additional members of the gene family in adult and
developing organisms. In Drosophila, olfactory sensory
neurons are restricted to the maxillary palp and third
antennal segment. The third antennal segment is covered with
approximately 500 fine sensory bristles or sensilla (Stocker,
1994), each containing from one to four neurons (Venkatesh
and Singh, 1984). The maxillary palp is covered with
approximately 60 sensilla, each of which is innervated by two
or three neurons (Singh and Nayak, 1985). Thus, the third
antennal segment and maxillary palp contain about 1500 and
120 sensory neurons, respectively.



RNA in situ hybridization experiments were performed with
digoxigenin-labeled RNA antisense probes to each of the 11
new members of the gene family under conditions of high
stringency. One linked pair of homologous genes, DOR53 and
DOR67, crosshybridizes, whereas the remaining 10 genes
exhibit no crosshybridization under these conditions (see
below). Eight of the 11 genes hybridize to a small
subpopulation (0.5-1.5%) of the 1500 olfactory sensory
neurons in the third antennal segment (Figure 4). One gene,
DOR71, is expressed in about 10% of the sensory neurons in
the maxillary palp but not in the antenna (Figure 4G). We
have not detected expression of DOR46 or DOR19 in the antenna
or the maxillary palp. Expression of this gene family is only
observed in cells within the antenna and maxillary palp. No
hybridization was observed in neurons of the brain, nor was
hybridization observed in any sections elsewhere in the adult
fly or in any tissue at any stage during embryonic
development. However, we do find hybridization to a small
number of cells in the developing antennae in the late pupal
stage (data not shown). We have not yet determined whether
this family of receptors is expressed in the larval olfactory
apparatus.

Only about one third of the cells in the third antennal
segment and the maxillary palp are neurons (data not shown),
which are interspersed with non-neuronal sensillar support
cells and glia. We have performed two experiments to
demonstrate that the family of seven transmembrane domain
receptor genes is expressed in sensory neurons rather than
support cells or glia within the antenna and maxillary palp.
First, we developed two-color fluorescent antibody detection
schemes to co-localize receptor expression in cells that
express the neuron-specific RNA binding protein, ELAV
(Robinow and White, 1988). An enhancer trap line carrying an
insertion of GAL4 at the elav locus expresses high levels of
lacZ in neurons when crossed to a transgenic UAS-lacZ
responder line (Lin and Goodman, 1994). Fluorescent antibody
detection of lacZ identifies the sensory neurons in a
horizontal section of the maxillary palp (Figure 5B).
Hybridization with the receptor probe DOR104 reveals
expression in 5 of the 12 lacZ positive cells in a horizontal
section of the maxillary palp (Figure 5A). All cells that
express DOR104 are also positive for lacZ (Figure 5C),
indicating that this receptor is expressed only in neurons.

In a second experiment we have demonstrated that the receptor
genes are not expressed in non-neuronal cells. The support
cells of the antenna express different members of a family
of odorant binding proteins (McKenna et al., 1994; Pikielny
et al., 1994; Kim et al., 1998). These genes encode abundant
low molecular weight proteins thought to transport odorants
through the sensillar lymph (reviewed in Pelosi, 1994).
Two-color in situ experiments with a probe for the odorant
binding protein, pbprp2 (Pikielny et al., 1994), reveal
hybridization to a large number of cells broadly distributed
throughout the antenna (Figure 5F). In the same section,
however, the probe DOR53 anneals to a non-overlapping
subpopulation of neurons restricted to the medial-proximal
domain of the antenna. In a similar experiment, in situ
hybridization with the odorant binding protein, OS-F (McKenna
et al., 1994), identifies a spatially restricted
subpopulation of support cells in the antenna, whereas the
DOR67 probe identifies a distinct subpopulation of neurons
in a medial-proximal domain (Figure 5G). Thus, the putative
odorant receptor genes are expressed in a subpopulation of
sensory neurons distinct from the support cells that express
the odorant binding proteins. Taken together, these data
demonstrate that 10 of the 12 family members we have
identified are expressed in small subpopulations of olfactory
sensory neurons in the antenna and maxillary palp.
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Spatially Defined Patterns of Receptor Expression 



The in situ hybridization experiments reveal that each
receptor is expressed in a spatially restricted subpopulation
of neurons in the antenna or maxillary palp (Figure 4). The
total number of cells expressing each receptor per antenna
was obtained by counting the positive cells in serial
sections of antennae from multiple flies. These numbers are
presented in the legend of Figure 4. DOR67 and 53, for
example, anneal to about 20 neurons on the medial-proximal
edge of the antenna (Figure 4A and B), whereas DOR62 and 87
anneal to subpopulations of 20 cells at the distal edge of
the antenna (Figure 4C-D). Approximately 10 cells in the
distal domain express DOR64 (Figure 4E). Each of the three
linked genes DOR71, 72, and 73 is expressed in different
neurons. DOR72 is expressed in approximately 15 antennal
cells (Figure 4H), while DOR73 is expressed in 1 to 2 cells
at the distal edge of the antenna (Figure 4I). In contrast,
DOR71 is expressed in approximately 10 maxillary palp neurons
but is not detected in the antenna (Figure 4G). The three
sensillar types are represented in a coarse topographic map
across the third antennal segment. The proximal-medial
region, for example, contains largely basiconic sensilla.
Receptors expressed in this region (DOR53 and 67) are
therefore likely to be restricted to the large basiconic
sensilla. More distal regions contain a mixture of all three
sensilla types and it is therefore not possible from these
data to assign specific receptors to specific sensillar
types.

The spatial pattern of neurons expressing a given receptor
is conserved between individuals. In situ hybridization with
two receptor probes to three individual flies reveals that
both the frequency and spatial distribution of the
hybridizing neurons are conserved in different individuals
(Figure 6). At present, we cannot determine the precision of
this topographic map and can only argue that given receptors
are expressed in localized domains.

In preliminary experiments, we have demonstrated that the
spatial pattern of expression of one receptor, DOR104, can
be recapitulated in transgenic flies with a promoter fragment
flanking the DOR104 gene. The fusion of the presumed DOR104
promoter (consisting of 3 kb of 5' DNA immediately adjacent
to the coding region) to the lacZ reporter gene has allowed
us to visualize a subpopulation of neurons expressing DOR104
within the maxillary palp. Whole mount preparations of the
heads of transgenic flies reveal a small subpopulation of
sensory neurons within the maxillary palp whose cell bodies
exhibit blue color after staining with X-gal (Figure 2B). The
number of positive cells, approximately 20 per maxillary
palp, corresponds well with that seen for DOR104 RNA
expression. Immunofluorescent staining of sections with
antibodies directed against β-galactosidase more clearly
reveals the dendrites and axons of these bipolar neurons in
the maxillary palp (Figure 2C). Levels of lacZ expression in
these transgenic lines are low and further amplification will
be necessary to allow us to trace the axons to glomeruli in
the antennal lobe. Nonetheless, the data suggest that the
information governing the spatial pattern of DOR104
expression in a restricted subpopulation of maxillary palp
neurons resides within 3 kb of DNA 5' to the DOR104 gene.

Individual Neurons Express Different Complements of Receptors

An understanding of the logic of olfactory discrimination in
Drosophila will require a determination of the diversity and
specificity of receptor expression in individual neurons. In
the vertebrate olfactory epithelium, a given neuron is likely
to express only one receptor from the family of 1,000 genes
(Ngai et al., 1993; Ressler et al., 1993; Vassar et al.,
1993; Chess et al., 1994; Dulac and Axel, unpublished). In
the nematode C. elegans, however, individual chemosensory
neurons are thought to express multiple receptor genes
(Troemel et al., 1995). Our observations with the putative
Drosophila odorant receptors indicate that a given receptor
probe anneals with 0.5-1.5% of antennal neurons, suggesting
that each cell expresses only a subset of receptor genes. If
we demonstrate that each of the different receptor probes
hybridizes with distinct, nonoverlapping subpopulations of
neurons, this would provide evidence that neurons differ with
respect to the receptors they express.

In situ hybridization was therefore performed with either a
mix of five receptor probes (Figure 4F) or individually with
each of the five probes (Figure 4A-E). We observe that the
number of olfactory neurons identified with the mixed probe
(about 60 per antenna) approximates the sum of the positive
neurons detected with the five individual probes. These
results demonstrate that individual receptors are expressed
in distinct nonoverlapping populations of olfactory neurons.

We have performed an additional experiment using two-color
RNA in situ hybridization to ask whether two receptor genes,
DOR64 and DOR87, expressed in interspersed cells in the
distal antenna are expressed in different neurons. Antisense
RNA probes for the two genes were labeled with either
digoxigenin- or FITC-UTP and were used in pairwise
combinations in in situ hybridization to sections through the
Drosophila antenna. Although these two genes are expressed
in overlapping lateral-distal domains, two-color in situ
hybridization reveals that neurons expressing DOR64 do not
express DOR87; rather, each gene is expressed in a distinct
cell population (Figure 5D and E). Taken together, these data
suggest that olfactory sensory neurons within the antenna are
functionally distinct and express different complements of
odorant receptors. At the extreme, the experiments are
consistent with a model in which individual neurons express
only a single receptor gene.

Our differential cloning procedure identified one additional
gene, A45, which shares weak identity (24%) with the DOR gene
family over a short region (93 amino acids). This gene,
however, does not appear to be a classical member of the DOR
family: it is far more divergent and significantly larger
than the other family members (486 amino acids). This gene
is expressed in all olfactory sensory neurons (data not
shown). If A45 does encode a divergent odorant receptor, then
it would be present in all sensory neurons along with
different complements of the more classical members of the
DOR gene family.

The Size and Organization of the Odorant Receptor Gene Family

How large is the family of odorant receptor genes in
Drosophila? Unlike vertebrate odorant receptors, which share
40-98% sequence identity at the amino acid level, the fly
receptors are extremely divergent. The extent of sequence
similarity between receptor subfamilies ranges from 20 to 30%.
The maxillary palp receptor DOR104 is the most distantly
related member of the family with about 17% identity to the
other receptor genes. Inspection of the receptor sequences
suggests that Southern blot hybridizations, even those
performed at low stringency, are unlikely to reveal multiple
additional members of a gene family. In accord with this,
Southern blot hybridization with receptor probes DOR24, 62,
and 72, performed at either high or low stringency, reveals
only a single hybridizing band following cleavage of genomic
DNA with three different restriction endonucleases (Figure
7C-E). The two linked clusters of receptors contain genes
with a greater degree of sequence conservation and define
small subfamilies of receptor genes. A cluster of three
receptors, DOR71, 72, and 73, is located at map position
33B1-2. The antennal receptors DOR72 and 73 are 55% identical
and both exhibit about 30% identity to the third gene at the
locus, DOR71, which is expressed in the maxillary palp. DOR67
and DOR53, members of a second subfamily, reside within 1 kb
of each other at map position 22A2-3 and exhibit 76% sequence
identity. Not surprisingly, these two linked genes
crosshybridize at low stringency. Southern blots probed with
either DOR67 or DOR53 reveal two hybridizing bands
corresponding to the two genes within the subfamily but fail
to detect additional subfamily members in the chromosome
(Figure 7A and B).
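
The identity figures quoted in this section (for example, 55% between DOR72 and DOR73, and 76% between DOR67 and DOR53) are fractions of matching residues over aligned positions. The text does not state which alignment method was used or how gaps were scored, so the following is only a minimal sketch under the assumption that the two sequences have already been aligned and that columns containing a gap are ignored. The function name `percent_identity` and the toy sequences are hypothetical and are not taken from the DOR sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over columns where neither sequence has a gap.

    Assumption: the inputs are pre-aligned strings of equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gap columns (one possible convention among several)
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy peptides, for illustration only.
toy_a = "MKTAYIAK"
toy_b = "MKSAYLAK"
print(percent_identity(toy_a, toy_b))  # 6 of 8 positions match -> 75.0
```

Other conventions (for instance, dividing by the length of the shorter sequence or counting gap columns as mismatches) would give different percentages, which is one reason identity values from different studies are not always directly comparable.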



The members of the receptor gene family described here are
present on all but the small fourth chromosome. No bias is
observed toward telomeric or centromeric regions. The map
positions, as determined from P1 and cosmid clones (Berkeley
Drosophila Genome Project; European Drosophila Genome
Project), are provided in Experimental Procedures. A
comparatively large number of receptor genes map to
chromosome 2 because the Berkeley Drosophila Genome Project
has concentrated its efforts on this chromosome. Unlike the
distribution of odorant receptors in nematodes and mammals
(Ben-Arie et al., 1994; Troemel et al., 1995; Robertson,
1998), only small linked arrays have been identified and the
majority of the family members are isolated at multiple,
scattered loci in the Drosophila genome.

The high degree of divergence among members of the Drosophila
odorant receptor gene family is more reminiscent of the
family of chemoreceptors in C. elegans than the more highly
conserved odorant receptors of vertebrates. Estimates of the
size of the Drosophila receptor gene family, therefore,
cannot be obtained by either Southern blot hybridization or
PCR analysis of genomic DNA. Rather, our estimates of the
gene family derive from the statistics of small numbers. We
detect 12 members of the odorant receptor gene family from
a Drosophila genome database that includes roughly 10% of the
genome. Recognizing a possible bias in our estimate, it seems
reasonable at present to estimate that the odorant receptor
family is likely to include 100 to 200 genes. This is in
accord with independent estimates from in situ hybridization
experiments that demonstrate that a given receptor probe
hybridizes with 0.5-1.5% of the neurons. If we assume that
a given neuron expresses only a single receptor gene, these
observations suggest that the gene family would include 100
to 200 members.
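
The two lines of reasoning in the preceding paragraph can be written out as simple arithmetic. The sketch below is not the authors' calculation; it is a naive scaling under two stated assumptions: that the sequenced 10% of the genome is a representative sample, and that each neuron expresses exactly one receptor with all receptors labeling a fraction of neurons in the quoted 0.5-1.5% range. The function names are hypothetical.

```python
def estimate_from_database(genes_found: int, percent_of_genome: float) -> float:
    """Scale the number of genes found up to the whole genome.

    Assumes the sequenced portion is an unbiased sample of the genome.
    """
    return genes_found * 100.0 / percent_of_genome


def estimate_from_labeling(min_percent: float, max_percent: float) -> tuple:
    """Return (low, high) family-size estimates from the percentage of
    neurons labeled by a single probe.

    Assumes one receptor gene per neuron, so a probe labeling p% of
    neurons implies roughly 100 / p receptor genes.
    """
    return 100.0 / max_percent, 100.0 / min_percent


print(estimate_from_database(12, 10))        # 120.0
low, high = estimate_from_labeling(0.5, 1.5)
print(round(low), round(high))               # 67 200
```

Under these assumptions the database sampling gives about 120 genes and the labeling fractions bracket roughly 67 to 200, which is broadly in line with the stated range of 100 to 200 genes.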



EXPERIMENTAL DISCUSSION

The Size and Divergence of the Gene Family 

We have identified a novel family of seven transmembrane
domain proteins that is likely to encode the Drosophila
odorant receptors. The number of different receptor genes
expressed in the neurons of the antenna and maxillary palp
will reflect the diversity and specificity of odor
recognition in the fruit fly. How large is the Drosophila
odorant receptor gene family? We have identified 11 members
of this divergent gene family in the Drosophila DNA database.
The potential for bias notwithstanding, it seems reasonable
to assume then that since only 10% of genomic sequence has
been deposited, this gene family is likely to contain from
100 to 200 genes. However, significant errors in our
estimates could result from bias in the nature of the
sequences represented in the 10% of the Drosophila genome
analyzed to date. In situ hybridization experiments
demonstrating that each of the receptor genes labels from
0.5-1.5% of the olfactory sensory neurons are in accord with
the estimate of 100 to 200 receptor genes.

Several divergent odorant receptor gene families, each
encoding seven transmembrane proteins, have been identified
in vertebrate and invertebrate species. In mammals, volatile
odorants are detected by a family of as many as 1,000
receptors each expressed in the main olfactory epithelium
(Buck and Axel, 1991; Levy et al., 1991; Parmentier et al.,
1992; Ben-Arie et al., 1994). This gene family shares
features with the serpentine neurotransmitter receptors and
is conserved in all vertebrates examined. Terrestrial
vertebrates have a second anatomically and functionally
distinct olfactory system, the vomeronasal organ, dedicated
to the detection of pheromones. Vomeronasal sensory neurons
express two distinct families of receptors each thought to
contain from 100 to 200 genes: one novel family of serpentine
receptors (Dulac and Axel, 1995), and a second related to the
metabotropic neurotransmitter receptors (Herrada and Dulac,
1997; Matsunami and Buck, 1997; Ryba and Tirindelli, 1997).

In the invertebrate C. elegans, chemosensory receptors are
organized into four gene families that share 20-40% sequence
similarity within a family and essentially no sequence
similarity between families (Troemel et al., 1995; Sengupta
et al., 1996; Robertson, 1998). The four gene families in C.
elegans together contain about 1,000 genes engaged in the
detection of odors. The nematode receptors exhibit no
sequence conservation with the three distinct families of
vertebrate odorant receptor genes. Our studies reveal that
Drosophila has evolved an additional divergent gene family
of serpentine receptors comprising 100 to 200 genes.
The observation that a similar function, chemosensory
detection, is accomplished by at least eight highly divergent
gene families, sharing little or no sequence similarity, is
quite unusual.

Why is the evolutionary requirement for odorant receptors so
often met by recruitment of novel gene families rather than
by exploiting pre-existing odorant receptor families in
ancestral genomes? The character of natural odorants along
with their physical properties (e.g. aqueous or volatile)
represent important selectors governing the evolution of
receptor gene families. The use of common "anthropomorphic"
odorant sets in the experimental analysis of olfactory
specificity has led to the prevailing view that significant
overlap exists in the repertoire of perceived odors between
different species. Studies of odorant specificity in
different species often employ odors at artificially high
concentrations and may present an inaccurate image of the
natural repertoire of odorants. We simply do not know the
nature of the odors that initially led to the ancestral
choice of receptor genes during the evolution of the
nematode, insect, or vertebrate species. Clearly, vastly
different properties in salient odors could dictate the
recruitment of new gene families to effect an old function,
olfaction. The character of the odor is not the only
evolutionary selector. Odorant receptors must interact with
other components in the signal transduction pathway [G
proteins (for review see Buck, 1996; Bargmann and Kaplan,
1998) and perhaps even RAMPs (McLatchie et al., 1998) and rho
(Mitchell et al., 1998)] that may govern the choice of one
family of serpentine receptors over another. Moreover,
mammalian receptors not only recognize odorants in the
environment but are likely to recognize guidance cues
governing formation of a sensory map in the brain (Wang et
al., 1998). Thus, the multiple properties required of the
odorant receptors might change vastly over evolutionary time
and this might underlie the independent origins of the
multiple chemosensory receptor gene families.

Establishing a Topographic Map in the Antenna and the Brain

We observe that individual receptor genes in the fly are
expressed in topographically conserved domains within the
antenna. This highly ordered spatial distribution of receptor
expression differs from that observed in the mammalian
olfactory epithelium. In mammals, a given receptor can be
expressed in one of four broad but circumscribed zones in the
main olfactory epithelium (Ressler et al., 1993; Vassar et
al., 1993). A given zone can express up to 250 different
receptors and neurons expressing a given receptor within a
zone appear to be randomly dispersed (Ressler et al., 1993;
Vassar et al., 1993). The highly ordered pattern of
expression observed in the Drosophila antenna might have
important implications for patterning the projections to the
antennal lobe. In visual, somatosensory, and auditory systems
the peripheral receptor sheet is highly ordered and neighbor
relations in the periphery are maintained in the projections
to the brain. These observations suggest that the relative
position of the sensory neuron in the periphery will
determine the pattern of projections to the brain.

Our data on the spatial conservation of receptor expression
in the antenna suggest that superimposed upon coarse spatial
patterning of olfactory sensilla (Venkatesh and Singh, 1984;
Ray and Rodrigues, 1995; Reddy et al., 1997) must be more
precise positional information governing the choice of
receptor expression. This spatial information might dictate
the fixed topographic pattern of receptor expression in the
peripheral receptor sheet and at the same time govern the
ordered sensory projections to the brain. This relationship
between positional identity and the pattern of neuronal
projections has been suggested for both peripheral sensory
neurons (Merritt and Whitington, 1995; Grillenzoni et al.,
1998) and neurons in the embryonic central nervous system of
Drosophila (Doe and Skeath, 1996).

Implications for Sensory Processing 

In mammals, olfactory neurons express only one of the
thousand odorant receptor genes. Neurons expressing a given
receptor project with precision to 2 of the 1800 glomeruli
in the mouse olfactory bulb. Odorants will therefore elicit
spatially defined patterns of glomerular activity such that
the quality of an olfactory stimulus is encoded by the
activation of a specific combination of glomeruli (Stewart
et al., 1979; Lancet et al., 1982; Kauer et al., 1987;
Imamura et al., 1992; Mori et al., 1992; Katoh et al., 1993;
Friedrich and Korsching, 1997). Moreover, the ability of an
odorant to activate a combination of glomeruli allows for the
discrimination of a diverse array of odors far exceeding the
number of receptors and their associated glomeruli. In the
nematode, an equally large family of receptor genes is
expressed in 16 pairs of chemosensory cells, only three of
which respond to volatile odorants (Bargmann and Horvitz,
1991; Bargmann et al., 1993). This immediately implies that
a given chemosensory neuron will express multiple receptors
and that the diversity of odors recognized by the nematode
might approach that of mammals, but the discriminatory power
is necessarily dramatically reduced.

What does the character of the gene family we have identified
in Drosophila tell us about the logic of olfactory processing
in this organism? We estimate that the Drosophila odorant
receptors comprise a family of from 100 to 200 genes.
Moreover, the pattern of expression of these genes in the
third antennal segment suggests that individual sensory
neurons express a different complement of receptors and, at
the extreme, our data are consistent with the suggestion that
individual neurons express one or a small number of
receptors. As in the case of mammals, the problem of odor
discrimination therefore reduces to a problem of the brain
discerning which receptors have been activated by a given
odorant. If the number of different types of neurons exceeds
the number of glomeruli (43) (Stocker, 1994; Laissue et al.,
1999), it immediately follows that a given glomerulus must
receive input from more than one kind of sensory neuron. This
implies that a single glomerulus will integrate multiple
olfactory stimuli. One possible consequence of this model
would be a loss of discriminatory power while maintaining the
ability to recognize a vast array of odors. Alternatively,
significant processing of sensory input may occur in the fly
antennal lobe to afford discrimination commensurate with the
large number of receptors.
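
The counting argument in the preceding paragraph is an instance of the pigeonhole principle. The short sketch below illustrates it under a simplifying assumption that is not established by the data: every neuron type projects to exactly one glomerulus and every glomerulus receives input. The function name is hypothetical; the figures 100, 200, and 43 are those quoted in the text.

```python
def min_types_in_busiest_glomerulus(n_neuron_types: int, n_glomeruli: int) -> int:
    """Smallest possible number of neuron types converging on the most
    heavily innervated glomerulus (pigeonhole bound).

    Assumes each neuron type targets exactly one glomerulus.
    """
    if n_neuron_types < 0 or n_glomeruli <= 0:
        raise ValueError("counts must be non-negative, with at least one glomerulus")
    # Integer ceiling division: ceil(n_neuron_types / n_glomeruli).
    return -(-n_neuron_types // n_glomeruli)


for n_types in (100, 200):
    # With 43 glomeruli, at least one glomerulus must receive this many types.
    print(n_types, min_types_in_busiest_glomerulus(n_types, 43))
# 100 3
# 200 5
```

With 100 to 200 neuron types and 43 glomeruli, at least one glomerulus would have to receive input from three to five different neuron types under this assumption.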

This model of olfactory coding is in sharp contrast with the
main olfactory system of vertebrates in which sensory neurons
express only a single receptor and converge on only a single
pair of spatially fixed glomeruli in the olfactory bulb.
Moreover, each projection neuron in the mammalian bulb
extends its dendrite to only a single glomerulus. Thus the
integration and decoding of spatial patterns of glomerular
activity, in vertebrates, must occur largely in the olfactory
cortex. In the fruit fly, the observation that the number of
receptors may exceed the number of glomeruli suggests that
individual glomeruli will receive input from more than one
type of sensory neuron. A second level of integration in the
antennal lobe is afforded by subsets of projection neurons
that elaborate extensive dendritic arbors that synapse with
multiple glomeruli. Thus, the Drosophila olfactory system
reveals levels of processing and integration of sensory input
in the antennal lobe that are likely to be restricted to
higher cortical centers in the main olfactory system of
vertebrates.

Protein and Nucleic Acid (nt) Sequences of 55 Drosophila
Odorant Receptor Genes

The following includes those genes first identified in
1998-1999. Protein sequences use single-letter amino acid
codes.
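
Each entry in this listing pairs a protein sequence with its nucleotide (nt) sequence, so the two can be cross-checked by translating the nt sequence. The following is a minimal sketch that assumes the standard genetic code and a reading frame starting at the first base; the names `CODON_TABLE` and `translate` are illustrative only. The test string is the first 30 nucleotides of the DOR10 nt sequence as given in this listing.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(nt: str) -> str:
    """Translate a DNA string in frame 1 into single-letter amino acid codes.

    Whitespace is removed first; stop codons are written as '*', codons
    containing characters other than A, C, G, T as 'X', and any trailing
    partial codon is ignored.
    """
    nt = "".join(nt.split()).upper()
    usable = len(nt) - len(nt) % 3
    return "".join(
        CODON_TABLE.get(nt[pos:pos + 3], "X") for pos in range(0, usable, 3)
    )


dor10_nt_start = "ATGGAAAAACTACGTTCCTATGAGGATTTC"
print(translate(dor10_nt_start))  # MEKLRSYEDF
```

The result matches the first ten residues listed for DOR10. The same check can help flag positions where a listed protein and its nt sequence disagree.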

DOR10

MEKLRS YEDF I FMANMMFKTLGYDL FHTPKPWWR YLLVRGYFVLCT I SNFYEASMVTT 
RIIEWESLAGSPSKIMRQGLHFFYMLSSQLKFITFMINRKRLLQLSHRLKELYPHKEQ 
NQRKYE VNKYYLSCS TRNVLYVYYFVMWMALEPLVQSQ F I VNVS LGTDLWMMCVS SQ 
I S MHLG YLANMLAS I RP S PETEQQDCDFLAS 1 1 KRHQLM I RLQ KD VNYVFGLLIASNL 
FTTS CLLCCMAYYTWEGFNWEGI S YMMLFAS VAAQFYWS SHGQML IDLLMTI TYRF
FAVI RQTVEK 

DOR10nt

ATGGAAAAACTACGTTCCTATGAGGATTTCATCTTCATGGCCAACATGATGTTCAAGA 
CCCTTGGCTACGATCTATTCCATACACCCAAACCCTGGTGGCGCTATCTGCTTGTGCG
AGGATACTTCGTTTTGTGCACGATCAGCAACTTTTACGAGGCTTCCATGGTGACGACA 
AGGATAATTGAGTGGGAATCCTTGGCCGGAAGTCCCTCCAAAATAATGCGACAGGGTC 
TGCACTTCTTTTACATGTTGAGTAGCCAATTGAAATTTATCACATTCATGATAAATCG 
CAAACGCCTACTGCAGCTGAGCCATCGTTTGAAAGAGTTGTATCCTCATAAAGAGCAA 

AATCAAAGGAAGTACGAGGTGAATAAATACTACCTATCCTGTTCCACGCGCAATGTTT

TGTACGTGTACTACTTTGTAATGGTCGTCATGGCACTGGAACCCCTCGTTCAGTCCCA 
GTTCATAGTGAATGTGAGCCTGGGCACAGATCTGTGGATGATGTGCGTCTCAAGCCAA 
ATATCGATGCACTTGGGCTATCTGGCCAATATGTTGGCCTCCATTCGACCAAGTCCAG 
AAACGGAACAACAAGACTGTGACTTCTTGGCCAGCATTATAAAGAGACATCAACTAAT 

GATCAGGCTTCAAAAGGACGTGAACTATGTTTTTGGACTCTTATTGGCATCTAATCTG

TTTAC CACATCCTG TTTACTTTG CTGCATGG CGTACT ATAC CGTCGTCGAAGGTTTCA 
ATTGGGAGGGCATTTCCTATATGATGCTCTTTGCTAGTGTAGCTGCCCAGTTCTACGT 
TGTCAGCTCACACGGACAAATGTTAATAGATTTGTTGATGACCATCACATACAGATTT 
TTCGCGGTTATACGACAAACTGTAGAAAAG

DOR104

MASLQFHGNVDADIRYDISLDPARESNLFRLLMGLQLANGTKPSPRLPKWWPKRLEMI 
GKVLPKAyCSMVIFTSLHLGVLFTKTTLDVLPTGELQAITDALTMTIIYFFTGyGTIY 
WCLRSRRLLAYMEHMNREYRHHSLAGVTFVSSHAAFRMSRNFTWWIMSCLLGVISWG 
VSPLMLGIRMLPLQCWYPFDALGPGTYTAVYATQLFGQIMVGMTFGFGGSLFVTLSLL
LLGQFDVLYCSLKNLDAHTKLLGGESVNGLSSLQEELLLGDSKRELNQYVLLQEHPTD 
LLRLSAGRKCPDQGNAFfiNALVECIRLHRFILHCSQELENLFSPYCLVKSLQITFQLC 
LLVFVGVSGTREVLRI VNQLQYLGLTI FELLMFTYCGELLSRHS IRSGDAFWRGAWWK 
HAHFIRQD I LI FLVNSRRAVHVTAGKFYVMDVNRLRS VI TQAFS FLTLLQKLAAKKTE 
SEL

DOR104nt

GAATTCGGCACGAGCAGTCGATGGCCAGTCTTCAGTTCCACGGCAACGTCGATGCGGA 
CATCAGGTATGATATTAGCCTGGATCCGGCTAGGGAATCGAATCTCTTCCGTCTGCTA 

ATGGGACTCCAGTTGGCGAATGGCACGAAGCCATCGCCGCGGTTACCCAAATGGTGGC
CAAAGCGGCTGGAAATGATTGGTAAAGTGCTGCCCAAAGCCTATTGTTCCATGGTGAT 
TTTCACCTCCCTGCATTTGGGTGTCCTGTTCACGAAAACCACACTGGATGTCCTGCCG 
ACGGGGGAGCTGCAGGCCATAACGGATGCCCTCACCATGACCATAATATACTTTTTCA 
CGGGCTACGGCACCATCTACTGGTGCCTGCGCTCCCGGCGCCTCTTGGCCTACATGGA 

GCACATGAACCGGGAGTATCGCCATCATTCGCTGGCCGGGGTGACCTTTGTGAGTAGC
CATGCGGCCTTTAGGATGTCCAGAAACTTCACGGTGGTGTGGATAATGTCCTGCCTGC 
TGGGCGTGATTTCCTGGGGCGTTTCGCCACTGATGCTGGGCATCCGGATGCTGCCGCT 
CCAATGTTGGTATCCCTTCGACGCCCTGGGTCCCGGCACATATACGGCGGTCTATGCT 
ACACAACTTTTCGGTCAGATCATGGTGGGCATGACCTTTGGATTCGGGGGATCACTGT 

TTGTCACCCTGAGCCTGCTACTCCTGGGACAATTCGATGTGCTCTACTGCAGCCTGAA

GAACCTGGATGCCCATACCAAGTTGCTGGGCGGGGAGTCTGTAAATGGCCTGAGTTCG 
CTGCAAGAGGAGTTGCTGCTGGGGGACTCGAAGAGGGAATTAAATCAGTACGTTTTGC 
TCCAGGAGCATCCGACGGATCTGCTGAGATTGTCGGCAGGACGAAAATGTCCTGACCA 
AGGAAATGCGTTTCACAACGCCTTGGTGGAATGCATTCGCTTGCATCGCTTCATTCTG 

CACTGCTCACAGGAGTTGGAGAATCTATTCAGTCCATATTGTCTGGTCAAGTCACTGC

AGATCACCTTTCAGCTTTGCCTGCTGGTCTTTGTGGGCGTTTCGGGTACTCGAGAGGT 
CCTGCGGATTGTCAACCAGCTACAGTACTTGGGACTGACCATCTTCGAGCTCCTAATG 
TTCACCTATTGTGGCGAACTCCTCAGTCGGCATAGTATTCGATCTGGCGACGCCTTTT 
GGAGGGGTGCGTGGTGGAAGCACGCCCATTTCATCCGCCAGGACATCCTCATCTTTCT 
GGTCAATAGTAGACGTGCAGTTCACGTGACTGCCGGCAAGTTTTATGTGATGGATGTG
AATCGTCTAAGATCGGTTATAACGCAGGCGTTCAGCTTCTTGACTTTGCTGCAAAAGT 
TGGCTGCCAAGAAGACGGAATCGGAGCTCTAAACTGGTACCACGCATCGATATTTATT 
TAGCGCATTAAAAAAAAGTCGAGTAAAAGCAAAAAAAAAAAAAAAAAAA 



DOR105 

MFED I QL I YMN I KI LRFWALL YDKNLRRYVC I GLAS FHI FTQ I VYMMSTNEGLTG 1 1 R 
NSYMLVLWINTVLRAYLLI^HDRYIALIQKLTEAYYDLLNLNDSYISEILDQVNKVG 
KLMARGNLFFGMLTSMGFGLYPLSSSERVLPFGSKIPGLNEYESPYYEMWYIFQMLIT 
PMGCCMYIPYTSLIVGLIMFGIVRCKALQHRLRQVALKHPYGDRDPRELREEIIACIR 
YQQS I IEYMDHINELTTMMFLFELMAFSALLCALLFMLI I VSGTSQLI IVCMYINMIL 
AQ I LALYWYANELREQNLAVATAAYETEWFTFDVPLRKNI LFMMMRAQRPAAI LLGNI 
RP I TLELFQNLLNTTYTFFTVLKRVYG 

DOR105nt

ATGTTTGAAGACATTCAGCTAATCTACATGAATATCAAGATATTGCGATTCTGGGCCC 
TGCTCTATGACAAAAACTTGAGGCGTTATGTGTGCATTGGACTGGCCTCATTCCACAT 
CTTCACCCAAATCGTCTACATGATGAGTACCAATGAAGGACTAACCGGGATAATTCGT 
AACTCATATATGCTCGTCCTTTGGATTAATACGGTGCTGCGAGCTTATCTCTTGCTGG 
CGGATCACGACAGATATTTGGCTTTGATCCAAAAACTAACTGAGGCCTATTACGATTT 
ACTGAATCTGAACGATTCGTATATATCGGAAATATTGGACCAGGTGAACAAGGTGGGA 
AAGTTGATGGCTAGGGGCAATCTGTTCTTTGGCATGCTCACATCCATGGGATTCGGTC 
TGTACCCATTGTCCTCCAGCGAAAGAGTCCTGCCATTTGGCAGCAAAATTCCTGGTCT 
AAATGAGTACGAGAGTCCGTACTATGAGATGTGGTACATCTTTCAGATGCTCATCACC 
CCGATGGGCTGTTGCATGTACATTCCGTACACCAGTCTGATTGTGGGCTTGATAATGT 
TCGGCATTGTGAGGTGCAAGGCTTTGCAGCATCGCCTCCGCCAGGTGGCGCTTAAGCA 
TCCGTACGGAGATCGCGATCCCCGTGAACTGAGGGAGGAGATCATAGCCTGCATACGT 
TACCAGCAGAGCATTATCGAGTACATGGATCACATAAACGAGCTGACCACCATGATGT 
TCCTATTCGAACTGATGGCCTTTTCGGCGCTGCTCTGTGCGCTGCTCTTTATGCTGAT 
TATCGTCAGCGGCACCAGTCAGCTGATAATTGTTTGCATGTACATTAACATGATTCTG 
GCCCAAATACTGGCCCTCTATTGGTATGCAAATGAGTTAAGGGAACAGAATCTGGCGG 
TGGCCACCGCAGCCTACGAAACGGAGTGGTTCACCTTCGACGTTCCACTGCGCAAAAA 
CATCCTGTTCATGATGATGAGGGCACAGCGGCCAGCTGCAATACTACTGGGCAATATA 
CGCCCCATCACTTTGGAACTGTTCCAAAACCTACTGAACACAACCTATACATTTTTTA 
CGGTTCTCAAGCGAGTCTACGGA 

DOR107 

MYPRFLS RNYPLAKHLFFVTRYS FGLLGLRFGKEQSWLHLLWLVFNFVNLAHCCQAEF 
VFGWS HLRT S P VDAMDAFCPLACS FTTLFKLG WMWWRRQEVADLMDR I RLL I GEQEKR 
EDSRRKVAQRSYYLMVTRCGMLVFTLGSITTGAFVLRSLWEMWVRRHQEFKFDMPFRM 
LFHDFAHRMPWFPWYLYSTWSGQVTVYAFAGTDGFFFGFTLYMAFLLQALRYDIQDA 
LKP I RDPS LRES KI CCQRLAD I VDRHNE IEKI VKEFSG I MAAPTFVHFVS ASLVI ATS
VI D I LLYS G YNI IRYWYTFTVS SAI FLYCYGGTEMSTES LSLGEAAYS SAWYTWDRE
TRRRVFL 1 1 LRAQRP I TVRVPFFAPS LPVFTS VI KFTGS I VALAKT IL 

DOR107nt 

ATGTATCCGCGATTCCTCAGCCGTAACTATCCGCTGGCCAAGCATTTGTTCTTCGTCA
CCAGATACTCCTTTGGCCTGCTGGGCCTGAGATTTGGCAAAGAGCAATCGTGGCTTCA 
CCTCTTGTGGCTGGTGTTCAATTTCGTTAACCTGGCGCACTGCTGCCAGGCGGAGTTC 
GTCTTCGGCTGGAGTCACTTGCGCACCAGTCCCGTGGATGCCATGGACGCCTTTTGTC 
CTCTGGCCTGCAGTTTCACCACGCTCTTCAAGCTGGGATGGATGTGGTGGCGTCGCCA 

GGAAGTAGCTGATCTAATGGACCGCATCCGCTTGCTCATCGGGGAGCAGGAGAAGAGG
GAGGACTCCCGGAGAAAGGTGGCTCAAAGGAGCTACTATCTCATGGTCACCAGGTGCG 
GTATGCTGGTCTTCACCCTGGGCAGCATTACCACTGGAGCCTTCGTTCTGCGTTCCCT 
TTGGGAAATGTGGGTGCGTCGTCATCAGGAGTTCAAATTCGATATGCCCTTTCGCATG 
CTGTTCCACGACTTTGCGCATCGCATGCCCTGGTTTCCAGTTTTCTATCTCTACTCCA 

CATGGAGTGGCCAGGTCACTGTGTACGCCTTTGCTGGTACAGATGGTTTCTTCTTTGG
CTTTACCCTCTACATGGCCTTCTTGCTGCAGGCCTTAAGATACGATATCCAGGATGCC 
CTCAAGCCAATAAGAGATCCCTCGCTTAGGGAATCCAAAATCTGCTGTCAGCGATTGG 
CGGACATCGTGGATCGCCACAATGAGATAGAGAAGATAGTCAAGGAATTTTCTGGAAT 
TATGGCTGCTCCAACTTTTGTTCACTTCGTATCAGCCAGCTTAGTGATAGCCACCAGC 

GTCATTGATATACTATTGTATTCCGGCTATAACATCATCCGTTACGTGGTGTACACCT
TCACGGTTTCCTCGGCCATCTTCCTCTATTGCTACGGAGGCACAGAAATGTCAACTGA 
GAGCCTTTCCTTGGGAGAAGCAGCCTACAGCAGTGCCTGGTATACTTGGGATCGAGAG 
ACCCGCAGGCGGGTCTTTCTCATTATCCTGCGTGCTCAACGACCCATTACGGTGAGGG 
TGCCCTTTTTTGCACCATCGTTACCAGTCTTCACATCGGTCATCAAGTTTACAGGTTC 

GATTGTGGCACTGGCTAAGACGATACTG

DOR108 

MDKHKDRIESMRLILQVMQLFGLWPWSLKSEEEWTFTGFVKRNYRFLLHLPITFTFIG 
LMWLEAFI SSNLEQAGQVLYMS ITEMALWKI LS I WHYRTEAWRLMYELQHAPDYQLH 

NQEEVDFWRREQRFFKWFFYI YILI SLGWYSGCTGVLFLEGYELPFAYYVPFEWQNE

RRYWFAYGYDr4AGMTLTCISNITLDTLGCYFLFHISLLYRLLGLRLRETKNMKNDTIF 
GQQLRAIFIMHQRIRSLTLTCQRIVSPYILSQIILSALIICFSGYRLQHVGIRDNPGQ 
F I SMLQFVS VMI LQI YLPCYYGNE I TVYANQLTNEVYHTNWLECRPP I RKLLNAYMEH 
LKKPVTI RAGNS FAVGLP I FVKTINNAYS FLALLLNVSN 



DOR108nt

ATGGATAAACACAAGGATCGCATTGAATCCATGCGCCTAATTCTTCAGGTCATGCAAC 
TATTTGGCCTCTGGCCGTGGTCCTTGAAATCGGAAGAGGAGTGGACTTTCACCGGTTT 
TGTAAAGCGCAACTATCGCTTCCTGCTCCATCTGCCCATTACCTTCACCTTTATTGGA 
CTCATGTGGCTGGAGGCCTTCATCTCGAGCAATCTGGAGCAGGCTGGCCAGGTTCTGT
ACATGTCCATCACCGAGATGGCTTTGGTGGTGAAAATCCTGAGCATTTGGCACTATCG 
CACCGAAGCTTGGCGGCTGATGTACGAACTCCAACATGCTCCGGACTACCAACTCCAC 
AACCAGGAGGAGGTAGACTTTTGGCGCCGGGAGCAACGATTCTTCAAGTGGTTCTTCT 
ACATCTACATTCTGATTAGCTTGGGCGTGGTATATAGTGGCTGCACTGGAGTACTTTT 

TCTGGAGGGCTACGAACTGCCCTTTGCCTACTACGTGCCCTTCGAATGGCAGAACGAG
AGAAGGTACTGGTTCGCCTATGGTTACGATATGGCGGGCATGACGCTGACCTGCATCT 
CAAACATTACCCTGGACACCCTGGGTTGCTATTTCCTGTTCCATATCTCTCTTTTGTA 
CCGACTGCTTGGTCTGCGATTGAGGGAAACGAAGAATATGAAGAATGATACCATTTTT 
GGCCAGCAGTTGCGTGCCATCTTCATTATGCATCAGAGGATTAGAAGCCTAACCCTGA 

CCTGCCAGAGAATCGTATCTCCCTATATCCTATCTCAGATCATTTTGAGTGCCCTGAT
CATCTGCTTTAGTGGATACCGCTTGCAGCATGTGGGAATTCGCGATAATCCCGGCCAG 
TTTATATCCATGTTGCAGTTTGTCAGTGTGATGATCCTGCAGATTTACTTGCCCTGCT 
ACTATGGAAACGAGATAACCGTGTATGCCAATCAGCTGACCAACGAGGTTTACCATAC 
CAATTGGCTGGAATGTCGGCCACCGATTCGAAAGTTACTCAATGCCTACATGGAGCAC 

CTGAAGAAACCGGTGACCATCCGGGCTGGCAACTCCTTCGCCGTGGGACTACCAATTT

TTGTTAAGACCATCAACAACGCCTACAGTTTCTTGGCTTTATTACTAAATGTATCGAA 
T 

DOR109 

MESTNRLSAIQTLLVIQRWIGLLKWENEGEDGVLTWLKRIYPFVLHLPLTFTYIALJW
YEAI TS SDFEEAGQVLYMS I TELALVTKLLNI WYRRHEAASL IHELQHD PAFNLRNS E 
E I KF WQQNQRNF KR I FYWY I WGSLFVAVMGY I S VFFQED YELPFG YYVPFEWRTRER Y 
FYAWGYNWAMTLCCLSNILLDTLGCYFMFHIASLFRLLGMRLEAX.KNAAEEKARPEL 
RRIFQLHTKVRRLTRECEVLVSPYVLSQWFSAFIICFSAYRLVHMGFKQRPGLFVTT 

VQFVAVMI VQI FLPCYYGNELTFHANALTNSVFGTNWLEYSVGTRKLLNCYMEFLKRP

VKVRAG VF FEIGLPI FVKT I NNAYS FFALLLK I S K 

DOR109nt 

ATGGAGTCTACAAATCGCCTAAGTGCCATCCAAACACTTTTAGTAATCCAACGTTGGA 
TAGGACTTCTTAAATGGGAAAACGAGGGCGAGGATGGAGTATTAACCTGGCTAAAACG
AATATATCCTTTTGTACTGCACCTTCCACTGACCTTCACGTATATTGCCTTAATGTGG 
TATGAAGCTATTACATCGTCAGATTTTGAGGAAGCTGGTCAAGTTCTGTACATGTCCA
TCACCGAACTGGCATTGGTCACTAAACTGCTGAATATTTGGTATCGTCGTCATGAAGC
TGCTAGTCTAATCCACGAATTGCAACACGATCCCGCATTTAATCTGCGCAATTCGGAG 
GAAATCAAATTCTGGCAGCAAAATCAGAGGAACTTTAAGAGAATATTTTACTGGTACA 
TCTGGGGCAGCCTTTTCGTGGCTGTAATGGGTTATATAAGCGTGTTTTTCCAGGAGGA 
TTACGAGCTGCCCTTTGGCTACTACGTGCCATTCGAGTGGCGCACCAGGGAACGATAC 
TTCTACGCTTGGGGCTATAATGTGGTGGCCATGACCCTGTGCTGTCTATCCAACATCC 
TACTGGACACACTAGGCTGTTATTTCATGTTCCACATCGCCTCGCTTTTCAGGCTTTT 
GGGAATGCGACTGGAGGCCTTGAAAAATGCAGCCGAAGAGAAAGCCAGACCGGAGTTG 
CGCCGCATTTTCCAACTGCACACTAAAGTCCGCCGATTGACGAGGGAATGCGAAGTGT 
TAGTTTCACCCTATGTTCTATCCCAAGTGGTCTTCAGTGCCTTCATCATCTGCTTCAG 
TGCCTATCGACTGGTGCACATGGGCTTCAAGCAGCGACCTGGACTCTTCGTGACCACC 
GTGCAATTCGTGGCCGTCATGATCGTCCAGATTTTCTTGCCCTGTTACTACGGCAATG 
AGTTGACCTTTCATGCCAATGCACTCACTAATAGTGTCTTCGGTACCAATTGGCTGGA 
GTACTCCGTGGGCACTCGCAAGCTGCTTAACTGCTACATGGAGTTCCTCAAGCGACCG
GTTAAAGTGCGAGCTGGGGTGTTCTTTGAAATAGGACTACCCATCTTTGTGAAGACCA 
TCAACAATGCCTACAGTTTCTTCGCCCTGCTGCTAAAGATATCCAAG 

DOR110 

MLFNYLRKPNPTNLLTSPDSFRYFEYGMFCMGWHTPATHKIIYYITSCLIFAWCAVYL 

PIGIIISFKTDINTFTPNELLTVMQLFFNSVGMPFKVLFFNLYISGFYKAKKLLSEMD

KRCTTLKERVEVHQGWRCNKAYLIYQFIYTAYTISTFLSAALSGKLPMRIYNPFVDF

RESRSSFWKAALNETALMLFAVTQTLMSDIYPLLYGLILRVHLKLLRLRVESLCTDSG 

KSDAENEQDLINYAAAIRPAVTRTIFVQFLLIGICLGLSMINLLFFADIWTGLATVAY 

INGLMVQTFPFCFVCDLLKKDCELLVSAIFHSNWINSSRSYKSSLRYFLKNAQKSIAF 

TAGSIFPISTGSNIKVAKLAFSWTFVNQLNIADRLTKN

DOR110nt

ATGTTGTTCAACTATCTGCGAAAGCCGAATCCCACAAACCTTTTGACTTCTCCGGACT 
CATTTAGATACTTTGAGTATGGAATGTTTTGCATGGGATGGCACACACCAGCAACGCA 
TAAGATAATCTACTATATAACATCCTGTTTGATTTTTGCTTGGTGTGCCGTATACTTG 
CCAATCGGAATCATCATTAGTTTCAAAACGGATATTAACACATTCACACCGAATGAAC 
TGTTGACAGTTATGCAATTATTTTTCAATTCAGTGGGAATGCCATTCAAGGTTCTGTT 
CTTCAATTTGTATATTTCTGGATTTTACAAGGCCAAAAAGCTCCTTAGCGAAATGGAC 
AAACGTTGCACCACTTTGAAGGAGCGAGTGGAAGTGCACCAAGGTGTGGTCCGTTGCA 
ACAAGGCCTACCTCATTTACCAGTTCATTTATACCGCGTACACTATTTCAACATTTCT 
ATCGGCGGCTCTTAGTGGAAAATTGCCATGGCGCATCTATAATCCTTTTGTGGATTTT 
CGAGAAAGTAGATCCAGTTTTTGGAAAGCTGCCCTCAACGAGACAGCACTTATGCTAT 
TTGCTGTGACTCAAACCCTAATGAGTGATATATATCCACTGCTTTATGGTTTGATCCT 
GAGAGTTCACCTCAAACTTTTGCGACTAAGAGTGGAGAGCCTGTGCACAGATTCTGGA
AAAAGCGATGCTGAAAACGAGCAAGATTTGATTAACTATGCTGCAGCAATACGACCAG
CGGTTACCCGCACAATTTTCGTTCAATTCCTCTTGATCGGAATTTGCCTTGGCCTTTC
AATGATCAATCTACTCTTCTTTGCCGACATCTGGACAGGATTGGCCACAGTGGCTTAC 
ATCAATGGTCTAATGGTGCAGACATTTCCATTTTGCTTCGTTTGTGATCTACTCAAAA 
AGGATTGTGAACTTCTTGTGTCGGCCATATTTCATTCCAACTGGATTAATTCAAGCCG
CAGTTACAAGTCATCTTTGAGATATTTTCTGAAGAACGCCCAGAAATCAATTGCTTTT 
ACAGCCGGCTCTATTTTTCCCATTTCTACTGGCTCGAATATTAAGGTGGCTAAGCTGG 
CATTTTCGGTGGTTACTTTTGTCAATCAACTTAACATAGCTGACAGATTGACAAAGAA 
C 


DOR111 

MLFRKRKPKSDDEVITFDELTRFPMTFYKTIGEDLYSDRDPNVIRRYLLRFYLVLGFL 
NFNAYWGEIAYFIVHIMSTTTLLEATAVAPCIGFSFMADFKQFGLTVNRKRLVRLLD
DLKEIFPLDLEAQRKYNVSFYRKHMNRVMTLFTILCMTYTSSFSFYPAIKSTIKYYLM
GSEIFERNYGFHILFPYDAETDLTVYWFSYWGLAHCAYVAGVSYVCVDLLLIATITQL
TMHFNFIANDLEAYEGGDHTDEENIKYLHNLVVYHARALDINKKCTFQSSRIGHSAFN
QNWLPCSTKYKRILQFIIARSQKPASIRPPTFPPISFNTFMKVISMSYQFFALLRTTY 
YG 

DOR111nt

ATGCTGTTCCGCAAACGTAAGCCAAAAAGTGACGATGAAGTCATCACCTTCGACGAAC 
TTACCCGGTTTCCGATGACTTTCTACAAGACCATCGGCGAGGATCTGTACTCCGATAG 
GGATCCGAATGTGATAAGGCGTTACCTGCTACGTTTTTATCTGGTACTCGGTTTTCTC 
AACTTCAATGCCTATGTGGTGGGCGAAATCGCGTACTTTATAGTCCATATAATGTCGA 

CGACTACTCTTTTGGAGGCCACTGCAGTGGCACCGTGCATTGGCTTCAGCTTCATGGC

CGACTTTAAGCAGTTCGGTCTCACAGTGAATAGAAAGCGATTGGTCAGATTGCTGGAT 
GATCTCAAGGAGATATTTCCTTTAGATTTAGAAGCGCAGCGGAAGTATAACGTATCGT 
TTTACCGGAAACACATGAACAGGGTCATGACCCTATTCACCATCCTCTGCATGACCTA 
CACCTCGTCATTTAGCTTTTATCCAGCCATCAAGTCGACCATAAAGTATTACCTTATG 

GGATCGGAAATCTTTGAGCGCAACTACGGATTTCACATTTTGTTTCCCTACGACGCAG

AAACGGATCTGACGGTCTACTGGTTTTCCTACTGGGGATTGGCTCATTGTGCCTATGT 
GGCCGGAGTTTCCTACGTCTGCGTGGATCTCCTGCTGATCGCGACCATAACCCAGCTG 
ACCATGCACTTCAACTTTATAGCGAATGATTTGGAGGCCTACGAAGGAGGTGATCATA 
CGGATGAAGAAAATATCAAATACCTGCACAACTTGGTCGTCTATCATGCCAGGGCGCT 
GGATATTAACAAGAAATGTACATTTCAGAGCTCTCGGATTGGCCATTCGGCATTTAAT
CAGAACTGGTTGCCATGCAGCACCAAATACAAACGCATCCTGCAATTTATTATCGCGC 
GCAGCCAGAAGCCCGCCTCTATAAGACCGCCTACCTTTCCACCCATATCTTTTAATAC 
CTTTATGAAGGTAATCAGCATGTCGTATCAGTTTTTTGCACTGCTCCGCACCACATAT
TATGGT

DOR114 

MLTKKDTQSAKEQEKLKAIPLHSFLKYANVFYLSIGMMAYDHKYSQKWKEVLLHWTFI
AQMVNLNTVLISELIYVFLAIGKGSNFLEATMNLSFIGFVIVGDFKIWNISRQRKRLT
QWSRLEELHPQGLAQQEPYNIGHHLSGYSRYSKFYFGMHMVLIWTYNLYWAVYYLVC
DFWLGMRQFERMLPYYCWVPWDWSTGYSYYFMYISQNIGGQACLSGQLAADMLMCALV
TLWMHFIRLSAHIESHVAGIGSFQHDLEFLQATVAYHQSLIHLCQDINEIFGVSLLS
NFVSSSFIICFVGFQMTIGSKIDNLVMLVLFLFCAMVQVFMIATHAQRLVDASEQIGQ
AVYNHDWFRADLRYRKMLILIIKRAQQPSRLKATMFLNISLVTVSDLLQLSYKFFALL

RTMYVN 

DOR114nt

ATGTTGACTAAGAAGGATACTCAAAGTGCCAAGGAGCAGGAAAAGTTGAAGGCCATTC 
CATTGCACAGCTTTCTGAAATATGCCAACGTGTTCTATTTATCGATTGGAATGATGGC
CTACGATCACAAGTACAGTCAAAAGTGGAAGGAGGTCCTGCTGCACTGGACATTCATT
GCCCAGATGGTCAATCTGAATACAGTGCTCATCTCGGAACTGATTTACGTATTCCTGG 
CGATCGGCAAAGGTAGCAATTTTCTGGAGGCCACCATGAATCTGTCTTTCATTGGATT 
TGTCATCGTTGGTGACTTCAAAATCTGGAACATTTCGCGGCAGAGAAAGAGACTCACC 
CAAGTGGTCAGCCGATTGGAAGAACTGCATCCGCAAGGCTTGGCTCAACAAGAACCCT
ATAATATAGGGCATCATCTGAGCGGCTATAGCCGATATAGCAAATTTTACTTCGGCAT 
GCACATGGTGCTGATATGGACGTACAACCTGTATTGGGCCGTTTACTATCTGGTCTGT 
GATTTCTGGCTGGGAATGCGTCAATTTGAGAGGATGCTGCCCTACTACTGCTGGGTTC 
CCTGGGATTGGAGTACCGGATATAGCTACTATTTCATGTATATCTCACAGAATATCGG 

CGGTCAGGCTTGTCTGTCCGGTCAGCTAGCAGCTGACATGTTAATGTGCGCCCTGGTC

ACTTTGGTGGTGATGCACTTCATCCGGCTTTCCGCTCACATCGAGAGTCATGTTGCGG 
GCATTGGCTCATTCCAGCACGATTTGGAGTTCCTCCAAGCGACGGTGGCGTATCACCA 
GAGCTTGATCCACCTCTGCCAGGATATCAATGAGATATTCGGTGTTTCACTGTTGTCC 
AACTTTGTATCCTCGTCGTTTATCATCTGCTTCGTGGGTTTCCAGATGACCATCGGCA 
GCAAGATCGACAACCTGGTAATGCTTGTGCTTTTCCTGTTTTGTGCCATGGTTCAGGT
CTTCATGATTGCCACCCATGCTCAGAGGCTCGTTGATGCGAGTGAACAGATTGGTCAA 
GCGGTCTATAATCACGACTGGTTCCGTGCTGATCTGCGGTATCGTAAAATGCTGATCC 
TGATTATTAAGAGGGCCCAACAGCCGAGTCGACTCAAGGCCACAATGTTCCTGAACAT 
CTCACTGGTCACCGTGTCGGATCTCTTGCAACTCTCGTACAAATTCTTTGCCCTTCTG 

CGCACAATGTACGTGAAT

DOR115

MEKLMKYASFFYTAVGIRPYTNGEESKMNKLIFHIVFWSNVINLSFVGLFESIYVYSA 
FMDNKFLEAVTALSYIGFVTVGMSKMFFIRWKKTAITELINELKEIYPNGLIREERYN
LPMYLGTCSRISLIYSLLYSVLIWTFNLFCVMEYWVYDKWLNIRWGKQLPYLMYIPW
KWQDNWSYYPLLFSQNFAGYTSAAGQISTDVLLCAVATQLVMHFDFLSNSMERHELSG
DWKKDSRFLVDIVRYHERILRLSDAVNDIFGIPLLLNFMVSSFVICFVGFQMTVGVPP
DIWKLFLFLVSSMSQVYLICHYGQLVADASYGFSVATYNQKWYKADVRYKRALVIII
ARSQKVTFLKATIFLDITRSTMTDVRNCVLSV

DOR115nt

ATGGAGAAGCTAATGAAGTACGCTAGCTTCTTCTACACAGCAGTGGGCATACGGCCAT 
ATACCAATGGTGAAGAATCCAAAATGAACAAACTTATATTTCACATAGTTTTTTGGTC 
CAATGTGATTAACCTCAGCTTCGTTGGATTATTTGAGAGCATTTACGTTTACAGTGCC 
TTCATGGATAATAAGTTCCTGGAAGCAGTCACTGCGTTGTCCTACATTGGCTTCGTAA

CCGTAGGCATGAGCAAGATGTTCTTCATCCGGTGGAAGAAAACGGCTATAACTGAACT
GATTAATGAATTGAAGGAGATCTATCCGAATGGTTTGATCCGAGAGGAAAGATACAAT 
CTGCCGATGTATCTGGGCACCTGCTCCAGAATCAGCCTTATATATTCCTTGCTCTACT 
CTGTTCTCATCTGGACATTCAACTTGTTTTGTGTAATGGAGTATTGGGTCTATGACAA 
GTGGCTCAACATTCGAGTGGTGGGCAAACAGTTGCCGTACCTCATGTACATTCCTTGG 

AAATGGCAGGATAACTGGTCGTACTATCCACTGTTATTCTCCCAGAATTTTGCAGGAT
ACACATCTGCAGCTGGTCAAATTTCAACCGATGTCTTGCTCTGCGCGGTGGCCACTCA 
GTTGGTAATGCACTTCGACTTTCTCTCAAATAGTATGGAACGCCACGAATTGAGTGGA 
GATTGGAAGAAGGACTCCCGATTTCTGGTGGACATTGTTAGGTATCACGAACGTATAC 
TCCGCCTTTCAGATGCAGTGAACGATATATTTGGAATTCCACTACTACTCAACTTCAT 

GGTATCCTCGTTCGTCATCTGCTTCGTGGGATTCCAGATGACTGTTGGAGTTCCGCCG
GATATAGTTGTGAAGCTCTTCCTCTTCCTTGTCTCTTCGATGAGTCAGGTCTATTTGA 
TTTGTCACTATGGTCAACTGGTGGCCGATGCTAGCTACGGATTTTCGGTTGCCACCTA 
CAATCAGAAGTGGTATAAAGCCGATGTGCGCTATAAACGAGCCTTGGTTATTATTATA 
GCTAGATCGCAGAAGGTAACTTTTCTAAAGGCCACTATATTCTTGGATATTACCAGGT 

CCACTATGACAGATGTACGCAACTGTGTATTGTCAGTG

DOR116 

MELLPLAMLMYDGTRVTAMQYLIPGLPLENNYCYWTYMIQTVTMLVQGVGFYSGDLF
VFLGLTQILTFADMLQVKVKELNDALEQKAEYRALVRVGASIDGAENRQRLLLDVIRW
HQLFTDYCRAINALYYELIATQVLSMALAMMLSFCINLSSFHMPSAIFFWSAYSMSI
YCILGTILEFAYDQVYESICNVTWYELSGEQRKLFGFLLRESQYPHNIQILGVMSLSV
RTALQIVKLIYSVSMMMMNRA

DOR116nt

ATGGAACTCCTGCCATTGGCCATGCTAATGTACGATGGAACCCGGGTTACTGCGATGC 
AGTATTTAATTCCGGGTCTACCGCTTGAGAACAATTATTGCTACGTAGTCACGTACAT 
GATTCAGACGGTGACAATGCTCGTGCAAGGAGTCGGATTCTACTCCGGTGATTTGTTC 
GTATTTCTCGGCTTAACGCAGATCCTAACTTTCGCCGATATGCTGCAGGTGAAGGTGA
AAGAGCTAAACGATGCCCTGGAACAAAAAGCGGAATACAGAGCTCTAGTCCGAGTTGG 
AGCTTCTATTGATGGAGCGGAAAATCGTCAACGCCTTCTCTTGGATGTTATAAGATGG 
CATCAATTATTCACGGACTACTGTCGCGCCATAAATGCCCTCTACTACGAATTGATCG 
CCACTCAGGTTCTTTCGATGGCTTTGGCCATGATGCTCAGCTTCTGCATTAATTTGAG 

CAGCTTTCACATGCCTTCGGCTATCTTTTTCGTGGTTTCTGCCTACAGCATGTCCATC
TATTGCATTCTGGGCACCATTCTTGAGTTTGCATATGACCAGGTGTACGAGAGCATCT 
GTAATGTGACCTGGTATGAGTTGAGTGGCGAACAGCGAAAGCTTTTTGGTTTTTTGTT 
GCGGGAATCCCAGTATCCGCACAATATTCAGATACTTGGAGTTATGTCGCTTTCCGTG 
AGAACGGCTCTGCAGATTGTTAAACTAATTTATAGCGTATCCATGATGATGATGAATC 

GGGCG



DOR117 

MDLRRWFPTLYTQSKDSPVRSRDATLYLLRCVFLMGVRKPPAKFFVAYVLWSFALNFC 

STFYQPIGFLTGYISHLSEFSPGEFLTSLQVAFNAWSCSTKVLIWALVKRFDEANNL

LDEMDRRITDPGERLQIHRAVSLSNRIFFFFMAVYMVYATNTFLSAIFIGRPPYQNYY 
PFLDWRSSTLHLALQAGLEYFAMAGACFQDVCVDCYPVNFVLVLRAHMSIFAERLRRL 
GTYPYESQEQKYERLVQCIQDHKVILRFVDCLRPVISGTIFVQFLWGLVLGFTLINI
VLFANLGSAIAALSFMAAVLLETTPFCILCNYLTEDCYKLADALFQSNWIDEEKRYQK
TLMYFLQKLQQPITFMAMNVFPISVGTNISVSRCAL

DOR117nt

ATGGATCTGCGAAGGTGGTTTCCGACCTTGTACACCCAGTCGAAGGATTCGCCAGTTC 
GCTCCCGAGACGCGACCCTGTACCTCCTACGCTGCGTCTTCTTAATGGGCGTCCGCAA 

GCCACCTGCCAAGTTTTTCGTGGCCTACGTGCTCTGGTCCTTCGCACTGAATTTCTGC

TCAACATTTTATCAGCCAATTGGCTTTCTCACAGGCTATATAAGCCATTTATCAGAGT 
TCTCCCCGGGAGAGTTTCTAACTTCGCTGCAGGTGGCCTTTAATGCTTGGTCCTGCTC 
TACAAAAGTCCTGATAGTGTGGGCACTAGTTAAGCGCTTTGACGAGGCTAATAACCTT 
CTCGACGAGATGGATAGGCGTATCACAGACCCCGGAGAGCGTCTTCAGATTCATCGCG 
CTGTCTCCCTCAGTAACCGTATATTCTTCTTTTTCATGGCAGTCTACATGGTTTATGC
CACTAATACGTTTCTGTCGGCGATCTTCATTGGAAGGCCACCGTACCAAAATTACTAC 
CCTTTTCTGGACTGGCGATCTAGCACTCTGCATCTAGCTCTGCAGGCCGGTCTGGAAT
ACTTCGCCATGGCTGGCGCCTGCTTCCAGGACGTTTGCGTTGATTGCTACCCAGTCAA
TTTCGTTTTGGTCCTGCGTGCCCACATGTCGATCTTCGCGGAGCGCCTTCGACGTTTG 
GGAACTTATCCTTATGAAAGCCAGGAGCAGAAATATGAACGATTGGTTCAGTGCATAC 
AAGATCACAAAGTAATTTTGCGATTTGTTGACTGCCTGCGTCCTGTTATTTCTGGTAC 
CATCTTCGTGCAATTCTTGGTTGTGGGGTTGGTGCTGGGCTTTACCCTAATTAACATT
GTCCTGTTCGCCAACTTGGGATCGGCCATCGCAGCGCTCTCGTTTATGGCCGCAGTGC 
TTCTAGAGACGACTCCCTTCTGCATATTGTGCAATTATCTCACAGAAGACTGCTACAA 
GCTGGCCGATGCCCTGTTTCAGTCAAACTGGATTGATGAGGAGAAACGATACCAAAAG 
ACACTCATGTACTTCCTACAGAAACTGCAGCAGCCTATAACCTTCATGGCTATGAACG 
TGTTTCCAATATCTGTGGGAACTAACATCAGTGTAAGCAGATGTGCCCTT



DOR118 

MKFIGWLPPKQGVLRYVYLTWTLMTFVWCTTYLPLGFLGSYMTQIKSFSPGEFLTSLQ
VCINAYGSSVKVAITYSMLWRLIKAKNILDQLDLRCTAMEEREKIHLWARSNHAFLI 
FTFVYCGYAGSTYLSSVLSGRPPWQLYNPFIDWHDGTLKLWVASTLEYMVMSGAVLQD 
QLSDSYPLIYTLILRAHLDMLRERIRRLRSDENLSEAESYEELVKCVMDHKLILRYCA
IIKPVIQGTIFTQFLLIGLVLGFTLINVFFFSDIWTGIASFMFVITILLQTFPFCYTC 

NLIMEDCESLTHAIFQSNWVDASRRYKTTLLYFLQNVQQPIVFIAGGIFQISMSSNIS
VAKFAFSVITITKQMNIADKFKTD

DOR118nt

ATGAAGTTTATTGGATGGCTGCCCCCCAAGCAGGGTGTGCTCCGGTATGTGTACCTCA 

CCTGGACGCTAATGACGTTCGTGTGGTGTACAACGTACCTGCCGCTTGGCTTCCTTGG

TAGCTACATGACGCAGATCAAGTCCTTCTCCCCTGGAGAGTTTCTCACTTCACTCCAG 
GTGTGCATTAATGCCTACGGCTCATCGGTAAAAGTTGCAATCACATACTCCATGCTCT 
GGCGCCTTATCAAGGCCAAGAACATTTTGGACCAGCTGGACCTGCGCTGCACCGCCAT 
GGAGGAGCGCGAAAAGATCCACCTAGTGGTGGCCCGCAGCAACCATGCCTTTCTCATC 

TTCACCTTTGTCTACTGCGGATATGCCGGCTCCACCTACCTGAGCTCGGTTCTCAGCG

GGCGTCCGCCCTGGCAGCTGTACAATCCCTTTATTGATTGGCATGACGGCACACTCAA 
GCTCTGGGTGGCCTCCACGTTGGAGTACATGGTGATGTCAGGCGCCGTTCTGCAGGAT 
CAACTCTCGGACTCTTACCCATTGATCTATACCCTCATCCTTCGTGCTCACTTGGACA 
TGCTAAGGGAGCGCATCCGACGCCTCCGTTCCGATGAGAACCTGAGCGAGGCCGAGAG 
CTATGAAGAGCTGGTCAAATGTGTGATGGACCACAAGCTCATTCTAAGATACTGCGCG
ATTATTAAACCAGTAATCCAGGGGACCATCTTCACACAGTTTCTGCTGATCGGCCTGG 
TTCTGGGCTTCACGCTGATCAACGTGTTTTTCTTCTCAGACATCTGGACGGGCATCGC 
ATCATTTATGTTTGTTATAACCATTTTGCTGCAGACCTTCCCCTTCTGCTACACATGC
AACCTCATCATGGAGGACTGCGAGTCCTTGACCCATGCTATTTTCCAGTCCAACTGGG
TGGATGCCAGTCGTCGCTACAAAACAACACTACTGTATTTTCTCCAAAACGTGCAGCA 
GCCTATCGTTTTCATTGCAGGCGGTATCTTTCAGATATCCATGAGCAGCAACATAAGT 
GTGGCAAAGTTTGCTTTCTCCGTGATAACCATTACCAAGCAAATGAATATAGCTGACA 
AATTTAAGACGGAC

DOR119 

MAVFKLIKPAPLTEKVQSRQGNIYLYRAMWLIGWIPPKEGVLRYVYLFWTCVPFAFGV 
FYLPVGFIISYVQEFKNFTPGEFLTSLQVCINVYGASVKSTITYLFLWRLRKTEILLD
SLDKRLANDSDRERIHNMVARCNYAFLIYSFIYCGYAGSTFLSYALSGRPPWSVYNPF
IDWRDGMGSLWIQAIFEYITMSFAVLQDQLSDTYPLMFTIMFRAHMEVLKDHVRSLRM
DPERSEADNYQDLVNCVLDHKTILKCCDMIRPMISRTIFVQFALIGSVLGLTLVNVFF
FSNFWKGVASLLFVITILLQTFPFCYTCNMLIDDAQDLSNEIFQSNWVDAEPRYKATL
VLFMHHVQQPIIFIAGGIFPISMNSNITVAKFAFSIITIVRQMNLAEQFQ

DOR119nt

ATGGCGGT

GTTCAAGCTAATCAAACCGGCTCCGTTGACCGAGAAGGTGCAGTCCCGCCAGGGGAAT
ATATATCTGTACCGTGCCATGTGGCTCATCGGATGGATTCCGCCGAAGGAGGGAGTCC 
TGCGCTACGTGTATCTCTTCTGGACCTGCGTGCCCTTCGCCTTCGGGGTGTTTTACCT 
GCCCGTGGGCTTCATCATCAGCTACGTGCAGGAGTTCAAGAACTTCACGCCGGGCGAG 
TTCCTTACCTCGCTGCAGGTGTGCATCAATGTGTATGGCGCCTCGGTGAAGTCCACCA 

TCACCTACCTCTTCCTCTGGCGACTGCGCAAGACGGAGATCCTTCTGGACTCCCTGGA
CAAGAGGCTGGCGAACGACAGCGATCGCGAGAGGATCCACAATATGGTGGCGCGCTGC 
AACTACGCCTTTCTCATCTACAGCTTCATCTACTGCGGATACGCGGGTTCCACTTTCC 
TGTCCTACGCCCTCAGTGGTCGTCCTCCGTGGTCCGTCTACAATCCCTTCATCGATTG 
GCGCGATGGCATGGGCAGCCTGTGGATCCAGGCCATATTCGAGTACATCACCATGTCC 

TTCGCCGTGCTGCAGGACCAGCTATCCGACACGTATCCCCTGATGTTCACCATTATGT

TCCGGGCCCACATGGAGGTCCTCAAGGATCACGTGCGGAGCCTGCGCATGGATCCCGA 
GCGCAGTGAGGCAGACAACTATCAGGATCTGGTGAACTGCGTGCTGGACCACAAGACT 
ATACTGAAATGCTGTGACATGATTCGCCCCATGATATCCCGCACCATCTTCGTGCAAT 
TCGCGCTGATTGGTTCCGTTTTGGGCCTGACCCTGGTGAACGTGTTCTTCTTCTCGAA 

CTTCTGGAAGGGCGTGGCCTCGCTCCTGTTCGTCATCACCATCCTGCTGCAGACCTTC

CCGTTCTGCTACACCTGCAACATGCTGATCGACGATGCCCAGGATCTGTCCAACGAGA 
TTTTCCAGTCCAACTGGGTGGACGCGGAGCCGCGCTACAAGGCGACGCTGGTGCTCTT 
CATGCACCATGTTCAGCAGCCCATAATCTTCATTGCCGGAGGCATCTTTCCCATCTCT 
ATGAACAGCAACATAACCGTGGCCAAGTTCGCCTTCAGCATCATTACAATAGTGCGAC 
AAATGAATCTGGCCGAGCAGTTCCAG

DOR120

MTKFFFKRLQTAPLDQEVSSLDASDYYYRIAFFLGWTPPKGALLRWIYSLWTLTTMWL
GIVYLPLGLSLTYVKHFDRFTPTEFLTSLQVDINCIGNVIKSCVTYSQMWRFRRMNEL 
ISSLDKRCVTTTQRRIFHKMVARVNLIVILFLSTYLGFCFLTLFTSVFAGKAPWQLYN 
PLVDWRKGHWQLWIASILEYCWSIGTMQELMSDTYAIVFISLFRCHLAILRDRIANL
RQDPKLSEMEHYEQMVACIQDHRTIIQCSQIIRPILSITIFAQFMLVGIDLGLAAISI
LFFPNTIWTIMANVSFIVAICTESFPCCMLCEHLIEDSVHVSNALFHSNWITADRSYK 
SAVLYFLHRAQQPIQFTAGSTFPISVQSNIAVAKFAFTIITIVNQMNLGEKFFSDRSN 
GDINP 


DOR120nt 

ATGACCAAGTTCTTCTTCAAGCGCCTGCAAACTGCTCCACTTGATCAGGAGGTGAGTT 
CCCTTGATGCCAGCGACTACTACTACCGCATCGCATTTTTCCTGGGCTGGACCCCGCC 
CAAGGGGGCTCTGCTCCGATGGATCTACTCCCTGTGGACTCTGACCACGATGTGGCTG 

GGTATCGTGTACCTGCCGCTCGGACTGAGCCTCACCTATGTGAAGCACTTCGATAGAT
TCACGCCGACGGAGTTCCTGACCTCCCTGCAGGTGGATATCAACTGCATCGGGAACGT 
GATCAAGTCATGCGTAACTTATTCCCAGATGTGGCGTTTTCGCCGGATGAATGAGCTT 
ATCTCGTCCCTGGACAAGAGATGTGTGACTACGACACAGCGTCGAATTTTCCATAAGA 
TGGTGGCACGGGTTAATCTCATCGTGATTCTGTTCTTGTCCACGTACTTGGGCTTCTG 

CTTTCTAACTCTGTTCACTTCGGTTTTCGCTGGCAAAGCTCCTTGGCAGCTGTACAAC
CCACTGGTGGACTGGCGGAAAGGCCATTGGCAGCTATGGATTGCCTCCATCCTGGAGT 
ACTGTGTGGTCTCCATTGGCACCATGCAGGAGTTGATGTCCGACACCTACGCCATAGT 
GTTCATCTCCTTGTTCCGCTGCCACCTGGCTATTCTCAGAGATCGCATAGCTAATCTG 
CGGCAGGATCCGAAACTCAGTGAGATGGAACACTATGAGCAGATGGTGGCCTGCATTC 

AGGATCATCGAACCATCATACAGTGCTCCCAGATTATTCGACCCATCCTGTCGATCAC

TATCTTTGCCCAGTTCATGCTGGTTGGCATTGACTTGGGTCTGGCGGCCATCAGCATC
CTCTTCTTTCCGAACACCATTTGGACGATCATGGCAAACGTGTCGTTCATCGTGGCCA 
TCTGTACAGAGTCCTTTCCATGCTGCATGCTCTGCGAGCATCTGATCGAGGACTCCGT 
CCATGTGAGCAACGCCCTGTTCCACTCAAACTGGATAACCGCGGACAGGAGCTACAAG 

TCGGCGGTTCTGTATTTCCTGCACCGGGCTCAGCAACCCATTCAATTCACGGCCGGCT

CCATATTTCCCATTTCGGTGCAGAGCAACATAGCCGTGGCCAAGTTCGCGTTCACAAT 
CATCACAATCGTGAACCAAATGAATCTGGGCGAGAAGTTCTTCAGTGACAGGAGCAAT 
GGCGATATAAATCCT 

DOR121

MLTDKFLRLQSALFRLLGLELLHEQDVGHRYPWRSICCILSVASFMPLTIAFGLQNVQ 
NVEQLTDSLCSVLVDLLALCKIGLFLWLYKDFKFLIGQFYCVLQTETHTAVAEMIVTR
ESRRDQFISAMYAYCFITAGLSACLMSPLSMLISYHEQVNCSRNFHFPVCKKKYCLIS
RILRYSFCRYPWDNMKLSNYIISYFWNVCAALGVALPTVCVDTLFCSLSHNLCALFQI
ARHKMMHFEGRNTKETHENLKHVFQLYALCLNLGHFLNEYFRPLICQFVAASLHLCVL
CYQLSANILQPALLFYAAFTAAWGQVSIYCFCGSSIHSECQLFGQAIYESSWPHLLQ
ENLQLVSSLKIAMMRSSLGCPIDGYFFEANRETLITVSKAFIKVSKKTPQVND

DOR121nt

ATGCTGACGGACAAGTTCCTCCGACTGCAGTCCGCTTTATTTCGCCTTCTCGGACTCG 
AATTGTTGCACGAGCAGGATGTTGGCCATCGATATCCTTGGCGCAGCATCTGCTGCAT 

TCTCTCGGTGGCCAGTTTCATGCCCCTGACCATTGCGTTTGGCCTGCAAAACGTCCAA
AATGTGGAGCAATTAACCGACTCACTCTGCTCGGTTCTCGTGGATTTGCTGGCCCTGT 
GCAAAATCGGGCTTTTCCTTTGGCTTTACAAGGACTTCAAGTTCCTAATAGGGCAGTT 
CTATTGTGTTTTGCAAACGGAAACCCACACCGCTGTCGCTGAAATGATAGTGACCAGG 
GAAAGTCGTCGGGATCAGTTCATCAGTGCTATGTATGCCTACTGTTTCATTACGGCTG 

GCCTTTCGGCCTGCCTGATGTCCCCTCTATCCATGCTGATTAGCTACCACGAACAGGT
GAATTGCAGCCGAAATTTCCATTTCCCAGTGTGTAAGAAAAAGTACTGCTTAATATCC 
AGAATATTAAGATACAGTTTCTGCAGATATCCCTGGGACAATATGAAGCTGTCCAACT 
ACATCATTTCCTATTTCTGGAATGTGTGTGCTGCATTGGGCGTGGCACTGCCCACCGT 
TTGTGTGGACACACTGTTCTGTTCTCTGAGCCATAATCTCTGTGCCCTATTCCAGATT 

GCCAGGCACAAAATGATGCACTTTGAGGGCAGAAATACCAAAGAGACTCATGACAACT
TAAAGCACGTGTTTCAACTATATGCGTTGTGTTTGAACCTGGGCCATTTCTTAAACGA 
ATATTTCAGACCGCTCATCTGCCAGTTTGTGGCAGCCTCACTGCACTTGTGTGTCCTG 
TGCTACCAACTGTCTGCCAATATCCTGCAGCCAGCGTTACTCTTCTATGCCGCATTTA 
CGGCAGCAGTTGTTGGCCAGGTGTCTATATACTGCTTCTGCGGATCGAGCATCCATTC 

GGAGTGTCAGCTATTTGGCCAGGCCATCTACGAGTCCAGCTGGCCCCATCTGCTGCAG

GAAAACCTGCAGCTTGTAAGCTCCTTAAAAATTGCCATGATGCGATCGAGTTTGGGAT 
GTCCCATCGATGGTTACTTCTTCGAGGCCAATCGGGAGACGCTCATCACGGTGAGTAA 
AGCGTTTATAAAAGTGTCCAAAAAGACACCTCAAGTGAATGAT 

DOR14

MDYDRIRPVRFLTGVLKWWRLWPRKESVSTPDWTNWQAYALHVPFTFLFVLLLWLEAI
KSRDIQHTADVLLICLTTTALGGKVINIWKYAHVAQGILSEWSTWDLFELRSKQEVDM 
WRFEHRRFNRVFMFYCLCSAGVIPFIVIQPLFDIPNRLPFWMWTPFDWQQPVLFWYAF
IYQATTIPIACACNVTMDAVNWYLMLHLSLCLRMLGQRLSKLQHDDKDLREKFLELIH

LHQRLKQQALSIEIFISKSTFTQILVSSLIICFTIYSMQMDLPGFAAMMQYLVAMIMQ

VMLPTIYGNAVIDSANMLTDSMYNSDWPDMNCR>KRLVLMFMWLmPVTLKAGGFFH 
IGLPLFTKWFSTLENPCISYLYFRP

DOR14nt

ATGGACTACGATCGAATTCGACCGGTGCGATTTTTGACGGGAGTGCTGAAATGGTGGC 
GTCTCTGGCCGAGGAAGGAATCGGTGTCCACACCGGACTGGACTAACTGGCAGGCATA 
TGCCTTGCACGTTCCATTTACATTCTTGTTTGTGTTGCTTTTGTGGTTGGAGGCAATC 
AAGAGCAGGGATATACAGCATACCGCCGATGTCCTTTTGATTTGCCTAACCACCACTG
CCTTGGGAGGTAAAGTTATCAATATCTGGAAGTATGCCCATGTGGCCCAAGGCATTTT 
GTCCGAGTGGAGCACGTGGGATCTTTTCGAGCTGAGGAGCAAACAGGAAGTGGATATG 
TGGCGATTCGAGCATCGACGTTTCAATCGTGTTTTTATGTTTTACTGTTTGTGCAGTG 
CTGGTGTAATCCCATTTATTGTGATTCAACCGTTGTTTGATATCCCAAATCGATTGCC 

CTTCTGGATGTGGACACCATTCGATTGGCAGCAGCCTGTTCTCTTCTGGTATGCATTC
ATCTATCAGGCCACAACCATTCCTATTGCCTGTGCTTGCAACGTAACCATGGACGCTG 
TTAATTGGTACTTGATGCTGCATCTGTCCTTGTGTTTGCGTATGTTGGGCCAGCGATT 
GAGTAAGCTTCAGCATGATGACAAGGATCTGAGGGAGAAGTTCCTGGAACTGATCCAT 
CTGCACCAGCGACTCAAGCAACAGGCCTTGAGCATTGAAATCTTTATTTCGAAGAGCA 

CGTTCACCCAAATTCTGGTCAGTTCCCTTATCATTTGCTTCACCATTTACAGCATGCA
GATGGACTTGCCAGGATTTGCCGCCATGATGCAGTACCTAGTGGCCATGATCATGCAG 
GTCATGCTGCCCACCATATATGGTAACGCCGTCATCGATTCTGCAAATATGTTGACCG 
ATTCCATGTACAATTCGGATTGGCCGGATATGAATTGCCGAATGCGTCGCCTAGTTTT 

ATTGGTTTACCTCTGTTTACCAAGGTTGTATTTTCTACTCTGGAAAATCCTTGTATAA
GTTATCTTTATTTCAGACCA 

DOR16 

MTDSGQPAIADHFYRIPRISGLIVGLWPQRIRGGGGRPWHAHLLFVFAFAMWVGAVG 

EVSYGCVHLDNLWALEAFCPGTTKAVCVLKLWVFFRSNRRWAELVQRLRAILWESRR

QEAQRMLVGLATTANRLSLLLLSSGTATNAAFTLQPLIMGLYRWIVQLPGQTELPFNI 
ILPSFAVQPGVFPLTYVLLTASGACTVFAFSFVDGFFICSCLYICGAFRLVQQDIRRI
FADLHGDSVDVFTEEMNAEVRHRLAQWERHNAIIDFCTDLTRQFTVIVLMHFLSAAF
VLCSTILDIMLVSPFSEAFLWGGYPWVCRATGFSHRLHSAAVLKVFPCFHCLLFFPGF 

SSRSVLIRFSRFVCLLCGCGCGSLRWQFISA

DOR16nt 

ATGACTGACAGCGGGCAGCCTGCCATTGCCGACCACTTTTATCGGATTCCCCGCATCT 
CCGGCCTCATTGTCGGCCTCTGGCCGCAAAGGATAAGGGGCGGGGGCGGTCGTCCTTG 
GCACGCCCATCTGCTCTTCGTGTTCGCCTTCGCCATGGTGGTGGTGGGTGCGGTGGGC
GAGGTGTCGTACGGCTGTGTCCACCTGGACAACCTGGTGGTGGCGCTGGAGGCCTTCT 
GCCCCGGAACCACCAAGGCGGTCTGCGTTTTGAAGCTGTGGGTCTTCTTCCGCTCCAA
TCGCCGGTGGGCGGAGTTGGTCCAGCGCCTGCGGGCTATTTTGTGGGAATCGCGGCGG
CAGGAGGCCCAGAGGATGCTGGTCGGACTGGCCACCACGGCCAACAGGCTCAGCCTGT 
TGTTGCTCAGCTCTGGCACGGCGACAAATGCCGCCTTCACCTTGCAACCGCTGATTAT 
GGGTCTCTACCGCTGGATTGTGCAGCTGCCAGGTCAAACCGAGCTGCCCTTTAATATC 
ATACTGCCCTCGTTTGCCGTGCAGCCAGGAGTCTTTCCGCTCACCTACGTGCTGCTGA
CCGCTTCCGGTGCCTGCACCGTTTTCGCCTTCAGCTTCGTGGACGGATTCTTCATTTG 
CTCGTGCCTCTACATCTGCGGCGCTTTCCGGCTGGTGCAGCAGGACATTCGCAGGATA 
TTTGCCGATTTGCATGGCGACTCAGTGGATGTGTTCACCGAGGAGATGAACGCGGAGG 
TGCGGCACAGACTGGCCCAAGTTGTCGAGCGGCACAATGCGATTATCGATTTCTGCAC 

GGACCTAACACGCCAGTTCACCGTTATCGTTTTAATGCATTTCCTGTCCGCCGCCTTC
GTCCTCTGCTCGACCATCCTGGACATCATGTTGGTGAGCCCCTTTTCAGAGGCCTTCC 
TTTGGGGCGGGTATCCTTGGGTTTGTCGCGCCACTGGCTTTTCGCATCGCCTGCATTC 
GGCGGCTGTTTTAAAAGTTTTTCCCTGTTTTCACTGTTTGCTGTTTTTCCCTGGCTTT 
TCCAGCCGCTCCGTTCTGATTCGGTTTTCCCGATTTGTTTGTTTGCTTTGTGGCTGCG 

GCTGCGGCTCTCTCCGGTGGCAATTTATAAGCGCATGA

DOR19 

MVTEDFYKYQVWYFQILGWQLPTWAADHQRRFQSMRFGFILVILFIMLLLFSFEMLN 
NISQVREILKVFFMFATEISCMAKLLHLKLKSRKLAGLVDAMLSPEFGVKSEQEMQML
ELDRVAWRMRNSYGIMSLGAASLILIVPCFDNFGELPLAMLEVCSIEGWICYWSQYL

FHSICLLPTCVLNITYDSVAYSLLCFLKVQLQMLVLRLEKLGPVIEPQDNEKIAMELR 
ECAAYYNRIVRFKDLVELFIKGPGSVQLMCSVLVLVSNLYDMSTMSIANGDAIFMLKT
CIYQLVMLWQIFIICYASNEVTVQSSRLCHSIYSSQWTGWNRANRRIVLLMMQRFNSP
MLLSTFNPTFAFSLEAFGSVGQQKFLYISFITGYALLLSDRQLLLQLLRTAEARQQLN
FETPQHLKIFKPIFKSTQNVMHVH

DOR19nt

ATGGTTACGGAGGACTTTTATAAGTACCAGGTGTGGTACTTCCAAATCCTTGGTGTTT 
GGCAGCTCCCCACTTGGGCCGCAGACCACCAGCGTCGTTTTCAGTCCATGAGGTTTGG 

CTTCATCCTGGTCATCCTGTTCATCATGCTGCTGCTTTTCTCCTTCGAAATGTTGAAC

AACATTTCCCAAGTTAGGGAGATCCTAAAGGTATTCTTCATGTTCGCCACGGAAATAT 
CCTGCATGGCCAAATTATTGCATTTGAAGTTGAAGAGCCGCAAACTCGCTGGCTTGGT 
TGATGCGATGTTGTCCCCAGAGTTCGGCGTTAAAAGTGAACAGGAAATGCAGATGCTG 
GAATTGGATAGAGTGGCGGTTGTCCGCATGAGGAACTCCTACGGCATCATGTCCCTGG 
GCGCGGCTTCCCTGATCCTTATAGTTCCCTGTTTCGACAACTTTGGCGAGCTACCACT
GGCCATGTTGGAGGTATGCAGCATCGAGGGATGGATCTGCTATTGGTCGCAGTACCTT 
TTCCACTCGATTTGCCTGCTGCCCACTTGTGTGCTGAATATAACCTACGACTCGGTGG 
CCTACTCGTTGCTCTGTTTCTTGAAGGTTCAGCTACAAATGCTGGTCCTGCGATTAGA
AAAGTTGGGTCCTGTGATCGAACCCCAGGATAATGAGAAAATCGCAATGGAACTGCGT
GAGTGTGCCGCCTACTACAACAGGATTGTTCGTTTCAAGGACCTGGTGGAGCTGTTCA 
TAAAGGGGCCAGGATCTGTGCAGCTCATGTGTTCTGTTCTGGTGCTGGTGTCCAACCT 
GTACGACATGTCCACCATGTCCATTGCAAACGGCGATGCCATCTTTATGCTCAAGACC 
TGTATCTATCAGCTGGTGATGCTCTGGCAGATCTTCATCATTTGCTACGCCTCCAACG
AGGTAACTGTCCAGAGCTCTAGGTTGTGTCACAGCATCTACAGCTCCCAATGGACGGG 
ATGGAACAGGGCAAACCGCCGGATTGTCCTTCTCATGATGCAGCGCTTTAATTCCCCG 
ATGCTCCTGAGCACCTTTAACCCCACCTTTGCTTTCAGCTTGGAGGCCTTTGGTTCTG 
TAGGGCAGCAGAAATTCCTTTATATATCATTTATTACTGGTTATGCTCTTCTCCTTTC 
AGATCGTCAACTGCTCCTACAGCTACTTCGCACTGCTGAAGCGCGTCAACAGTTAAAT
TTCGAAACACCGCAGCACCTAAAGATTTTCAAGCCGATTTTTAAAAGCACTCAAAACG 
TTATGCACGTACAT 

DOR20 

MSKGVEIFYKGQKAFLNILSLWPQIERRWRIIHQVNYVHVIVFWVLLFDLLLVLHVMA
NLSYMSEWKAIFILATSAGHTTKLLSIKANNVQMEELFRRLDNEEFRPRGANEELIF
AAACERSRKLRDFYGALSFAALSMILIPQFALDWSHLPLKTYNPLGENTGSPAYWLLY 
CYQCLALSVSCITNIGFDSLCSSLFIFLKCQLDILAVRLDKIGRLITTSGGTVEQQLK 
ENIRYHMTIVELSKTVERLLCKPISVQIFCSVLVLTANFYAIAWSCEFATRRLSVCD
LSGVHVDSDFYIVLLCRVGIPYPKCLPRPVMNFIVSEVTQRSLDLPHELYKTSWVDWD
YRSRRIALLFMQRLHSTLRIRTLNPSLGFDLMLFSSVSSFRVLTFLCTVANFHNEAH

DOR20nt

ATGAGCAAAGGAGTAGAAATCTTTTACAAGGGCCAGAAGGCATTCTTGAACATCCTCT 
CGTTGTGGCCTCAGATAGAACGCCGGTGGAGAATCATCCACCAGGTGAACTATGTCCA 

CGTAATTGTGTTTTGGGTGCTGCTCTTTGATCTCCTCTTGGTGCTCCATGTGATGGCT

AATTTGAGCTACATGTCCGAGGTTGTGAAAGCCATCTTTATCCTGGCCACCAGTGCAG 
GGCACACCACCAAGCTGCTGTCCATAAAGGCGAACAATGTGCAGATGGAGGAGCTCTT 
TAGGAGATTGGATAACGAAGAGTTCCGTCCTAGAGGCGCCAACGAAGAGTTGATCTTT 
GCAGCAGCCTGTGAAAGAAGTAGGAAGCTTCGGGACTTCTATGGAGCGCTTTCGTTTG 

CCGCCTTGAGCATGATTCTCATACCCCAGTTCGCCTTGGACTGGTCCCACCTTCCGCT

CAAAACATACAATCCGCTTGGCGAGAATACCGGCTCACCTGCTTATTGGCTCCTCTAC 
TGCTATCAGTGTCTGGCCTTGTCCGTATCCTGCATCACCAACATAGGATTCGACTCAC 
TCTGCTCCTCACTGTTCATCTTCCTCAAGTGCCAGCTGGACATTCTGGCCGTGCGACT 
GGACAAGATCGGTCGGTTAATCACTACTTCTGGTGGCACTGTGGAACAGCAACTTAAG 
GAAAATATCCGCTATCACATGACCATCGTTGAACTGTCGAAAACCGTGGAGCGTCTAC
TTTGCAAGCCGATTTCGGTGCAGATCTTCTGCTCGGTTTTGGTGCTGACTGCCAATTT 
CTATGCCATTGCTGTGGTGAGCTGTGAATTCGCAACAAGAAGACTATCAGTATGTGAC 
CTATCAGGCGTGCATGTTGATTCAGATTTTTATATTGTGCTACTATGCCGGGTGGGTA
TTCCATATCCGAAATGCCTCCCCAGGCCAGTAATGAATTTCATCGTCAGTGAGGTAAC
CCAGCGCAGCCTGGACCTTCCGCACGAGCTGTACAAGACCTCCTGGGTGGACTGGGAC 
TACAGGAGCCGAAGGATTGCGCTCCTCTTTATGCAACGCCTTCACTCGACCTTGAGGA 
TTAGGACACTTAATCCAAGTCTTGGTTTTGACTTAATGCTCTTCAGCTCGGTGAGTTC 
TTTCCGTGTTTTGACTTTTTTGTGCACTGTAGCCAATTTCCATAATGAGGCTCAT

DOR24 

MDSFLQVQKSTIALLGFDLFSENREMWKRPYRAMNVFSIAAIFPFILAAVLHNWKNVL
LIADAMVALLITILGLFKFSMILYLRRDFKRLIDKFRLLMSMEAEQGEEYAEILNAAN 
KQDQRMCTLFRTCFLLAWALNSVLPLVRMGLSYWLAGHAEPELPFPCLFPWNIHIIRN
YVLSFIWSAFASTGWLPAVSLDTIFCSFTSNLCAFFKIAQYKWRFKGGSLKESQAT 
LNKVFALYQTSLDMCNDLNQCYQPIICAQFFISSLQLCMLGYLFSITFAQTEGVYYAS
FIATIIIQAYIYCYCGENLKTESASFEWAIYDSPWHESLGAGGASTSICRSLLISMMR
AHRGFRITGYFFEANMEAFSSIVRTAMSYITMLRSFS


DOR24nt 

GGCACGAGCCTTGTCGACATGGACAGTTTTCTGCAAGTACAGAAGAGCACCATTGCTC 
TTCTGGGCTTTGATCTCTTTAGTGAAAATCGAGAAATGTGGAAACGCCCCTATAGAGC 
AATGAATGTGTTTAGCATAGCTGCCATTTTTCCCTTTATCCTGGCAGCTGTGCTCCAT 
AATTGGAAGAATGTATTGCTGCTGGCCGATGCCATGGTGGCCCTACTAATAACCATTC
TGGGCCTATTCAAGTTTAGCATGATACTTTACTTACGTCGCGATTTCAAGCGACTGAT 
TGACAAATTTCGTTTGCTCATGTCGAATGAGGCGGAACAGGGCGAGGAATACGCCGAG 
ATTCTCAACGCAGCAAACAAGCAGGATCAACGAATGTGCACTCTGTTTAGGACTTGTT 
TCCTCCTCGCCTGGGCCTTGAATAGTGTTCTGCCCCTCGTGAGAATGGGTCTCAGCTA 

TTGGTTAGCAGGTCATGCAGAGCCCGAGTTGCCTTTTCCCTGTCTTTTTCCCTGGAAT

ATCCACATCATTCGCAATTATGTTTTGAGCTTCATCTGGAGCGCTTTCGCCTCGACAG 
GTGTGGTTTTACCTGCTGTCAGCTTGGATACCATATTCTGTTCCTTCACCAGCAACCT 
GTGCGCCTTCTTCAAAATTGCGCAGTACAAGGTGGTTAGATTTAAGGGCGGATCCCTT 
AAAGAATCACAGGCCACATTGAACAAAGTCTTTGCCCTGTACCAGACCAGCTTGGATA 
TGTGCAACGATCTGAATCAGTGCTACCAACCGATTATCTGCGCCCAGTTCTTCATTTC
ATCTCTGCAACTCTGCATGCTGGGATATCTGTTCTCCATTACTTTTGCCCAGACAGAG
GGCGTGTACTATGCCTCTTTCATAGCCACCATCATTATACAAGCCTATATCTACTGCT 
ACTGCGGGGAGAACCTGAAGACGGAGAGTGCCAGCTTCGAGTGGGCCATCTACGACAG 
TCCGTGGCACGAGAGTTTGGGTGCTGGTGGAGCCTCTACCTCGATCTGCCGATCCTTG 

CTGATCAGCATGATGCGGGCTCATCGGGGATTCCGCATTACGGGATACTTCTTCGAGG

CAAACATGGAGGCCTTCTCATCGATTGTTCGCACGGCTATGTCCTACATCACAATGCT 
GAGATCATTCTCCTAAATGTGGTTTGACCACAAGGCTTTGGATTGATTTTTGTGCAAT
TTTTGTTTTATTGCTGAGCATGCGTTGCCGTACGACATTTAACAATCGATCTTACGTA
ATTTACATATGATAATCTCACATATTGTTCGTTAAGCACTAAGTAGAATGTAGAATGT 
GAATTGGCTGTAGAAATGCACAGATGAAGCACGAAAAAAAAAAAAAAAAAAAAAAAA 

DOR25

MNDSGYQSNLSLLRVFLDEFRSVLRQESPGLIPRLAFYYVRAFLSLPLYRWINLFIMC 
NVMTIFWTMFVALPESKNVIEMGDDLVWISGMALVFTKIFYMHLRCDEIDELISDFEY
YNRELRPHNIDEEVLGWQRLCYVIESGLYINCFCLVNFFSAAIFLQPLLGEGKLPFHS 
VYPFQWHRLDLHPYTFWFLYIWQSLTSQHNLMSILMVDMVGISTFLQTALNLKLLCIE 
IRKLGDMEVSDKRFHEEFCRWRFHQHIIKLVGKANRAFNGAFNAQLMASFSLISIST
FETMAAAAVDPKMAAKFVLLMLVAFIQLSLWCVSGTLVYTQSVEVAQAAFDINDWHTK 
SPGIQRDISFVILRAQKPLMYVAEPFLPFTLGTYMLVLKNCYRLLALMQESM

DOR25nt

ATGAACGACTCGGGTTATCAATCAAATCTCAGCCTTCTGCGGGTTTTTCTCGACGAGT 
TCCGATCGGTTCTGCGGCAGGAAAGTCCCGGTCTCATCCCACGCCTGGCTTTTTACTA 
TGTTCGCGCCTTTCTGAGCTTGCCCCTGTACCGATGGATCAACTTGTTCATCATGTGC 
AATGTGATGACCATTTTCTGGACCATGTTCGTGGCCCTGCCCGAGTCGAAGAACGTGA 
TCGAAATGGGCGACGACTTGGTTTGGATTTCGGGGATGGCACTGGTGTTCACCAAGAT 
CTTTTACATGCATTTGCGTTGCGACGAGATCGATGAACTTATTTCGGATTTTGAATAC 
TACAACCGGGAGCTGAGACCCCATAATATCGATGAGGAGGTGTTGGGTTGGCAGAGAC 
TGTGCTACGTGATAGAATCGGGTCTATATATCAACTGCTTTTGCCTGGTCAACTTCTT 
CAGTGCCGCTATTTTCCTGCAACCTCTGTTGGGCGAGGGAAAGCTGCCCTTCCACAGC 
GTCTATCCGTTTCAATGGCATCGCTTGGATCTGCATCCCTACACGTTCTGGTTCCTCT 
ACATCTGGCAGAGTCTGACCTCGCAGCACAACCTAATGAGCATTCTAATGGTGGATAT 
GGTAGGCATTTCCACGTTCCTCCAGACGGCGCTCAATCTCAAGTTGCTTTGCATCGAG 
ATAAGGAAACTGGGGGACATGGAGGTCAGTGATAAGAGGTTCCACGAGGAGTTTTGTC 
GTGTGGTTCGCTTCCACCAGCACATTATCAAGTTGGTGGGGAAAGCCAATAGAGCTTT 
CAATGGCGCCTTCAATGCACAATTAATGGCCAGTTTCTCCCTGATTTCCATATCCACT 
TTCGAGACCATGGCTGCAGCGGCTGTGGATCCCAAAATGGCCGCCAAGTTCGTGCTTC 
TCATGCTGGTGGCATTCATTCAACTGTCGCTTTGGTGCGTCTCTGGAACTTTGGTTTA 
TACTCAGTCAGTGGAGGTGGCTCAGGCTGCTTTTGATATCAACGATTGGCACACCAAA 
TCGCCAGGCATCCAGAGGGATATATCCTTTGTGATACTACGAGCCCAGAAACCCCTGA 
TGTATGTGGCCGAACCATTTCTGCCCTTCACCCTGGGAACCTATATGCTTGTACTGAA 
GAACTGCTATCGTTTGCTGGCCCTGATGCAAGAATCGATGTAG

DOR28

MYSPEEAAELKRRNYRSIREMIRLSYTVGFNLLDPSRCGQVLRIWTIVLSVSSLASLY 
GHWQMLARYIHDIPRIGETAGTALQFLTSIAKMWYFLFAHRQIYELLRKARCHELLQK
CELFERMSDLPVIKEIRQQVESTMNRYWASTRRQILIYLYSCICITTNYFINSFVINL 
YRYFTKPKGSYDIMLPLPSLYPAWEHKGLEFPYYHIQMYLETCSLYICGMCAVSFDGV 
FIVLCLHSVGLMRSLNQMVEQATSELVPPDRRVEYLRCCIYQYQRVANFATEVNNCFR
HITFTQFLLSLFNWGLALFQMSVGLGNNSSITMIRMTMYLVAAGYQIWYCYNGQRFA
TASEEIANAFYQVRWYGESREFRHLIRMMLMRTNRGFRLDVSWFMQMSLPTLMAVSSG
AEQSRGPAGPAGPAGPPPRVPSYSQFHLIDSQMVRTSGQYFLLLQNVNQK 

DOR28nt 

ATGTACTCACCGGAAGAGGCGGCCGAACTGAAGAGGCGCAACTATCGCAGCATCAGG 

GAGATGATCCGACTCTCCTATACGGTGGGCTTCAACCTGTTGGATCCTTCCCGATGCG 

GACAGGTGCTCAGAATCTGGACAATTGTCCTTAGCGTGAGTAGCTTGGCATCGCTTTA 

TGGGCACTGGCAAATGTTAGCCAGGTACATTCATGATATTCCACGCATTGGAGAGACC 

GCTGGAACTGCCCTGCAGTTCCTAACATCGATAGCAAAGATGTGGTACTTTCTGTTTG 

CCCATAGACAGATATACGAATTGCTACGAAAGGCGCGCTGCCATGAATTACTCCAAAA 

GTGTGAGCTCTTTGAAAGGATGTCAGATCTACCTGTTATCAAAGAGATTCGCCAGCAG 

GTTGAGTCCACGATGAATCGGTACTGGGCCAGCACTCGTCGGCAAATTCTTATCTATT 

TGTACAGCTGTATTTGTATTACTACAAACTACTTTATCAACTCCTTCGTAATCAACCT 

CTATCGCTATTTCACTAAACCGAAAGGATCCTACGACATAATGTTACCTCTGCCATCT 

CTGTATCCCGCCTGGGAGCACAAGGGATTAGAGTTTCCCTACTATCATATACAGATGT 

ACCTGGAAACCTGTTCTCTGTATATCTGCGGCATGTGTGCCGTTAGCTTTGATGGAGT 

CTTTATTGTCCTGTGCCTTCATAGCGTGGGACTTATGAGGTCACTTAACCAAATGGTG 

GAACAAGCCACATCTGAGTTGGTTCCTCCAGATCGCAGGGTTGAATACTTGCGATGCT 

GTATTTATCAGTACCAACGAGTGGCGAACTTTGCAACCGAGGTTAACAACTGCTTTCG 

GCACATCACTTTCACGCAGTTCCTGCTTAGCCTTTTCAACTGGGGCCTGGCCTTGTTC 

CAAATGAGCGTCGGATTGGGCAACAACAGCAGCATCACCATGATCCGGATGACCATGT 

ACCTGGTGGCAGCCGGCTATCAGATAGTTGTGTACTGCTACAATGGCCAGCGATTTGC 

GACTGCTAGCGAGGAGATTGCCAACGCCTTTTACCAGGTGCGATGGTACGGAGAGTCC 

AGGGAGTTCCGCCACCTCATCCGCATGATGCTGATGCGCACGAACCGGGGATTCAGGC

TGGACGTGTCCTGGTTCATGCAAATGTCCTTGCCCACACTCATGGCGGTGAGTAGCGG 

AGCAGAGCAGAGCAGGGGTCCTGCAGGTCCTGCAGGTCCTGCAGGTCCACCCCCAAGG 

GTCCCCTCCTACAGCCAGTTCCACTTGATTGATTCGCAGATGGTCCGGACAAGTGGAC 

AGTACTTCCTGCTGCTGCAGAACGTCAACCAGAAA

DOR30

MAVSTRVATKQEVPESRRAFRNLFNCFYALGMQAPDGSRPTTSSTWQRIYACFSWMY
WQLLLVPTFFVISYRYMGGMEITQVLTSAQVAIDAVILPAKIVALAWNLPLLRRAEH
HLAALDARCREQEEFQLILDAVRFCNYLVWFYQICYAIYSSSTFVCAFLLGQPPYALY
LPGLDWQRSQMQFCIQAWIEFLIMNWTCLHQASDDVYAVIYLYWRIQVQLLARRVEK
LGTDDSGQVEIYPDERRQEEHCAELQRCIVDHQTMLQLLDCISPVISRTIFVQFLITA
AINGTTMINIFIFANTNTKIASIIYLLAVTLQTAPCCYQATSLMLDNERLALAIFQCQ
WLGQSARFRKMLLYYLHRAQQPITLTAMKLFPINLATYFSIAKFSFSLYTLIKGMNLG 
ERFNRTN 


DOR30nt

ATGGCGGTGAGCACTCGTGTGGCCACAAAGCAGGAAGTGCCCGAATCCCGGCGAGCGT 
TTAGGAATCTCTTCAATTGCTTCTATGCCCTTGGCATGCAGGCACCGGATGGCAGTCG 
ACCGACCACGAGCAGCACATGGCAACGCATCTACGCCTGCTTCTCGGTGGTCATGTAC 

GTGTGGCAACTGCTGCTGGTGCCCACATTCTTTGTGATCAGCTATCGGTACATGGGCG
GCATGGAGATTACCCAGGTGCTGACCTCCGCCCAGGTGGCCATCGATGCGGTCATTCT 
GCCGGCCAAGATTGTGGCACTGGCGTGGAATTTGCCATTGCTGCGCAGAGCAGAGCAT 
CATCTGGCCGCCTTGGATGCGCGGTGCAGGGAACAGGAGGAGTTCCAATTGATCCTCG 
ATGCGGTGAGGTTTTGCAACTATCTGGTATGGTTCTACCAGATCTGCTATGCCATCTA 

CTCCTCGTCGACATTTGTGTGCGCCTTCCTGCTGGGCCAACCGCCATATGCCCTCTAT
TTGCCTGGCCTCGATTGGCAGCGTTCCCAGATGCAGTTCTGCATCCAGGCCTGGATTG 
AGTTCCTTATCATGAACTGGACGTGCCTGCACCAAGCTAGCGATGATGTGTACGCCGT 
TATCTATCTGTATGTGGTCCGGATTCAAGTGCAATTGCTGGCCAGGCGGGTGGAGAAG 
CTGGGCACGGATGATAGTGGCCAGGTGGAGATCTATCCCGATGAGCGGCGGCAGGAGG 

AGCATTGCGCGGAACTGCAGCGCTGCATTGTAGATCACCAGACGATGCTGCAGCTGCT

CGACTGCATTAGTCCCGTCATCTCGCGTACCATATTCGTTCAGTTCCTGATCACCGCC 
GCCATCATGGGCACCACCATGATCAACATTTTCATTTTCGCCAATACGAACACGAAGA 
TCGCATCGATCATTTACCTGCTGGCGGTGACCCTGCAGACGGCTCCATGTTGCTATCA 
GGCCACCTCGCTGATGTTGGACAACGAGAGGCTGGCCCTGGCCATCTTCCAGTGCCAG 

TGGCTGGGCCAGAGTGCCCGGTTCCGTAAGATGCTGCTCTACTATCTTCATCGCGCCC

AGCAGCCCATCACGCTGACCGCCATGAAGCTGTTTCCCATCAATCTGGCCACGTACTT 
CAGTATAGCCAAGTTCTCGTTTTCGCTCTACACGCTCATCAAGGGGATGAATCTCGGC 
GAGCGATTCAACAGGACAAAT 

DOR31

MIFKYIQEPVLGSLFRSRDSLIYLNRSIDQMGWRLPPRTKPYWWLYYIWTLVVIVLVF
IFIPYGLIMTGIKEFKNFTTTDLFTYVQVPVNTNASIMKGIIVLFMRRRFSRAQKMMD
AMDIRCTKMEEKVQVHRAAALCNRVVVIYHCIYFGYLSMALTGALVIGKTPFCLYNPL
VNPDDHFYLATAIESVTMAGIILANLILDVYPIIYVVVLRIHMELLSERIKTLRTDVE
KGDDQHYAELVECVKDHKLIVEYGNTLRPMISATMFIQLLSVGLLLGLAAVSMQFYNT
VMERVVSGVYTIAILSQTFPFCYVCEQLSSDCESLTNTLFHSKWIGAERRYRTTMLYF
IHNVQQSILFTAGGIFPICLNTNIKMAKFAFSVVTIVNEMDLAEKLRRE

DOR31nt

ATGATTTTTAAGTACATTCAAGAGCCAGTCCTTGGATCCTTATTTCGATCCCGGGATT 
CGCTGATCTACTTAAACAGATCCATAGATCAAATGGGATGGAGACTGCCGCCACGAAC 

TAAGCCGTACTGGTGGCTCTATTACATTTGGACATTGGTGGTCATAGTACTCGTCTTT
ATCTTTATACCCTATGGACTGATAATGACTGGAATAAAGGAGTTCAAGAACTTCACGA 
CCACGGATCTGTTTACGTATGTCCAGGTGCCGGTTAACACCAATGCTTCGATCATGAA 
GGGCATTATAGTGTTGTTTATGCGGCGGCGATTTTCAAGGGCTCAGAAGATGATGGAC 
GCCATGGACATTCGATGCACCAAGATGGAGGAGAAAGTCCAGGTGCACCGAGCAGCAG 

CCTTATGCAATCGTGTTGTTGTGATTTACCATTGCATATACTTCGGCTATCTATCCAT
GGCCTTAACCGGAGCTCTGGTGATTGGGAAGACTCCATTCTGTTTGTACAATCCACTG 
GTTAACCCCGACGATCATTTCTATCTGGCCACTGCCATTGAATCGGTCACCATGGCTG 
GCATTATTCTGGCCAATCTCATTTTGGACGTATATCCCATCATATATGTGGTCGTTCT 
GCGGATCCACATGGAGCTCTTGAGTGAGCGAATCAAGACGCTGCGTACTGATGTGGAA 

AAAGGCGACGATCAACATTATGCCGAGCTGGTGGAGTGTGTAAAGGATCACAAGCTAA
TTGTCGAATATGGAAACACTCTGCGTCCCATGATATCCGCCACGATGTTCATCCAACT 
ACTATCCGTTGGCTTACTTTTGGGTCTGGCAGCGGTGTCCATGCAGTTCTATAACACC 
GTAATGGAGCGTGTTGTCTCCGGGGTCTACACCATAGCCATTCTATCCCAGACCTTTC 
CATTTTGCTATGTCTGTGAGCAGCTGAGCAGCGATTGCGAATCCCTGACCAACACACT 

GTTCCATTCCAAGTGGATTGGAGCTGAGCGACGATACAGAACCACGATGTTGTACTTC

ATTCACAATGTTCAGCAGTCGATTTTGTTCACTGCGGGCGGAATTTTCCCCATATGTC 
TAAACACCAATATAAAGATGGCCAAGTTCGCTTTCTCAGTGGTGACCATTGTAAATGA 
GATGGACTTGGCCGAGAAATTGAGAAGGGAG 

DOR32

MEPVQYSYEDFARLPTTVFWIMGYDMLGVPKTRSRRILYWIYRFLCLASHGVCVGVMV
FRMVEAKTIDNVSLIMRYATLVTYIINSDTKFATVLQRSAIQSLNSKLAELYPKTTLD
RIYHRVNDHYWTKSFVYLVIIYIGSSIMVVIGPIITSIIAYFTHNVFTYMHCYPYFLY
DPEKDPVWIYISIYALEWLHSTQMVISNIGADIWLLYFQVQINLHFRGIIRSLADHKP
SVKHDQEDRKFIAKIVDKQVHLVSLQNDLNGIFGKSLLLSLLTTAAVICTVAVYTLIQ
GPTLEGFTYVIFIGTSVMQVYLVCYYGQQVLDLSGEVAHAVYNHDFHDASIAYKRYLL
IIIIRAQQPVELNAMGYLSISLDTFKQLMSVSYRVITMLMQMIQ

DOR32nt

ATGGAACCTGTGCAGTACAGCTACGAGGATTTCGCTCGATTGCCCACGACGGTGTTCT 
GGATCATGGGCTACGACATGCTGGGCGTTCCGAAGACCCGCTCTCGCAGGATACTATA 
CTGGATATATCGTTTCCTCTGTCTCGCCAGCCATGGGGTCTGTGTAGGAGTCATGGTA 
TTTCGTATGGTGGAGGCAAAGACCATTGACAATGTTTCGCTGATCATGCGGTATGCCA
CTCTGGTCACCTATATCATCAACTCGGATACGAAATTCGCAACTGTCTTACAAAGGAG 
TGCAATTCAAAGTCTAAACTCAAAACTGGCCGAACTATATCCGAAGACCACGCTGGAC 
AGGATCTATCACCGGGTGAATGATCACTATTGGACCAAGTCATTTGTATATTTGGTTA 
TTATCTACATTGGTTCGTCGATTATGGTTGTTATTGGACCGATTATTACGTCGATTAT 

AGCTTACTTCACGCACAACGTTTTCACCTACATGCACTGCTATCCGTACTTTTTGTAT
GATCCTGAGAAGGATCCGGTTTGGATCTACATCAGCATCTATGCTCTGGAATGGTTGC 
ACAGCACACAGATGGTCATTTCGAACATTGGCGCGGATATCTGGCTGCTGTACTTTCA 
GGTGCAGATAAATCTCCACTTCAGGGGCATTATACGATCACTGGCGGATCACAAGCCC 
AGTGTGAAGCACGACCAGGAGGACAGGAAATTCATTGCGAAAATTGTCGACAAGCAGG 

TGCACCTGGTCAGTTTGCAAAACGATCTGAATGGTATCTTTGGAAAATCGCTGCTTCT
AAGCCTGCTGACCACCGCAGCGGTTATCTGCACGGTGGCGGTGTACACTCTGATTCAG 
GGTCCCACCTTGGAGGGCTTCACCTATGTGATCTTCATCGGGACTTCTGTGATGCAGG 
TCTACCTGGTGTGCTATTACGGTCAGCAAGTTCTCGACTTGAGCGGCGAGGTGGCCCA 
CGCCGTGTACAATCATGATTTTCACGATGCTTCTATAGCGTACAAGAGGTACCTGCTC

ATAATCATTATCAGGGCGCAGCAGCCCGTGGAACTTAATGCCATGGGCTACCTGTCCA
TTTCGCTGGACACCTTTAAACAGCTGATGAGCGTCTCCTACCGGGTTATAACCATGCT 
CATGCAGATGATTCAG 



DOR37

**protein sequence is incomplete and is in progress**

KVD S TRAL VNHWRI FR I MG I HP PGKRTFWGRH YTAY S MVWNVTFH I CI WVS FS VNLLQ 
SNSLETFCESLCVTMPHTLYMLKLINVRRMRGQMI SSHWLLRLLDKRX.GCDDERQI IM 
AG I ERAEF I FRT I FRGLACTWLG I IYISAS SEPTLMYPTWI P WNWRDS TSAYLATAM 
LHTTALMANATLVLNLSS YPGTYL I LVS VHTKALALRVS KLGYGAPLPAVRMQAI LVG 

YIHDHQIILR*VSGNLISQCKNF*SISGVLTFIERRMYTHFGVPNIFIVIEDYYILFL

NYSLFKSLERSLSMTCFLQFFSTACAQCTI CYFLLFGNVGIMRFMNMLFLLVILTTET 
LLLCYTAELPCKEGESLLTAVYSC1TOLSQSWFRRLLLLMLARCQIPMILVSGVIVPI 
SMKTF 



DOR37nt

**information on nucleotide sequence is in progress**

DOR38

MRLIKISYSALNEVCVWLKLNGSWPLTESSRPWRSQSLLATAYIVWAWYVIASVGITI
SYQTAFLLNNLSDIIITTENCCTTFMGVLNFVRLIHLRLNQRKFRQLIENFSYEIWIP
NSSKNNVAAECRRRMVTFSIMTSLLACLIIMYCVLPLVEIFFGPAFDAQNKPFPYKMI
FPYDAQSSWIRYVMTYIFTSYAGICVVTTLFAEDTILGFFITYTCGQFHLLHQRIAGL
FAGSNAELAESIQLERLKRIVEKHNNIISANSV

DOR38nt

ATGCGTTTGATCAAAATTTCATATTCGGCACTTAATGAGGTGTGCGTTTGGCTGAAAC 
TGAATGGTTCTTGGCCATTAACCGAATCATCGAGGCCATGGAGGAGCCAATCCTTATT
GGCCACCGCCTACATCGTGTGGGCGTGGTACGTCATTGCATCTGTGGGCATAACAATC 
AGCTATCAGACGGCCTTTTTGCTGAACAACCTTTCGGACATTATTATCACCACGGAAA 
ATTGTTGCACCACCTTTATGGGTGTCCTGAACTTTGTCCGACTCATCCATCTTCGCCT 
CAATCAGAGGAAATTCCGCCAGCTTATTGAGAACTTTTCCTACGAAATTTGGATACCT 
AATTCTTCCAAAAACAATGTTGCCGCCGAGTGTCGCAGACGCATGGTTACCTTCAGCA
TAATGACATCCTTGCTAGCGTGCCTGATCATAATGTATTGTGTCCTGCCGCTGGTGGA 
GATCTTCTTTGGACCCGCCTTCGATGCACAGAACAAGCCGTTTCCCTACAAGATGATC 
TTTCCGTACGATGCCCAGAGCAGTTGGATCCGATATGTGATGACCTACATCTTCACCT 
CCTACGCGGGAATCTGTGTGGTCACCACCTTGTTTGCAGAGGACACCATTCTTGGCTT
CTTCATAACCTACACTTGTGGCCAATTTCATTTGCTACACCAACGAATCGCAGGTTTA
TTTGCGGGTTCCAATGCGGAATTGGCCGAGAGCATTCAGCTGGAGCGACTCAAACGTA 
TTGTGGAAAAACACAACAATATTATCAGCGCAAATTCTGTA 

DOR44 

MKSTFKEERIKDDSKRRDLFVFVRQTMCIAAMYPFGYYVNGSGVLAVLVRFCDLTYEL
FNYFVSVHIAGLYICTIYINYGQGDLDFFVNCLIQTIIYLWTIAMKLYFRRFRPGLLN
TILSNINDEYETRSAVGFSFVTMAGSYRMSKLWIKTYVYCCYIGTIFWLALPIAYRDR
SLPLACWYPFDYTQPGVYEVVFLLQAMGQIQVAASFASSSGLHMVLCVLISGQYDVLF
CSLKNVLASSYVLMGANMTELNQLQAEQSAADVEPGQYAYSVEEETPLQELLKVGSSM
DFSSAFRLSFVRCIQHHRYIVAALKKIESFYSPIWFVKIGEVTFLMCLVAFVSTKSTA
ANSFMRMVSLGQYLLLVLYELFIICYFADIVFQNSQRCGEALWRSPWQRHLKDVRSDY
MFFMLNSRRQFQLTAGKISNLNVDRFRGVGILT

DOR44nt 

ATGAAGAGCACATTCAAGGAAGAAAGGATTAAGGACGACTCCAAGCGTCGCGACCTGT
TTGTATTCGTGAGGCAAACCATGTGTATAGCGGCCATGTATCCCTTCGGTTACTACGT 
GAATGGATCTGGAGTCCTGGCCGTTCTGGTGCGATTCTGTGACTTGACCTACGAGCTC 
TTTAACTACTTCGTTTCGGTACACATAGCTGGCCTGTACATCTGCACCATCTACATCA
ACTATGGGCAAGGCGATTTGGACTTCTTCGTGAACTGTTTGATACAAACCATTATTTA 
TCTGTGGACAATAGCGATGAAACTCTACTTTCGGAGGTTCAGACCTGGTTTGTTGAAT 
ACCATTCTGTCCAACATCAATGATGAGTACGAGACACGTTCGGCTGTGGGATTCAGTT 
TCGTCACAATGGCGGGATCCTATCGGATGTCCAAGCTATGGATCAAAACCTATGTGTA
TTGCTGCTACATAGGCACCATTTTCTGGCTGGCTCTTCCCATTGCCTACCGGGATAGG 
AGTCTTCCTCTTGCCTGCTGGTATCCCTTTGACTATACACAACCCGGTGTCTATGAGG 
TAGTGTTCCTTCTCCAGGCGATGGGACAGATCCAAGTGGCCGCATCCTTTGCCTCCTC 
CAGTGGCCTGCATATGGTGCTTTGTGTGCTGATATCAGGGCAGTACGATGTCCTCTTT 

TGCAGTCTCAAGAATGTATTAGCCAGCAGCTATGTCCTTATGGGAGCCAATATGACGG
AACTGAATCAATTGCAGGCTGAGCAATCTGCGGCCGATGTCGAGCCAGGTCAGTATGC 
TTACTCCGTGGAGGAGGAGACACCTTTGCAAGAACTTCTAAAAGTTGGGAGCTCAATG 
GACTTCTCCTCCGCATTCAGGCTGTCTTTTGTGCGGTGCATTCAGCACCATCGATACA 
TAGTGGCGGCACTGAAGAAAATTGAGAGTTTCTACAGTCCCATATGGTTCGTGAAGAT 

TGGCGAAGTCACCTTTCTTATGTGCCTGGTAGCCTTCGTCTCCACGAAGAGCACCGCG
GCCAACTCATTCATGCGAATGGTCTCCTTGGGCCAGTACCTGCTCTTAGTTCTCTACG 
AGCTGTTCATCATCTGCTACTTCGCGGACATCGTTTTTCAGAACAGCCAGCGGTGCGG 
TGAAGCCCTCTGGCGAAGTCCTTGGCAGCGACATTTGAAGGATGTTCGCAGTGATTAC 
ATGTTCTTTATGCTGAATTCCCGCAGGCAGTTCCAACTTACGGCCGGAAAAATAAGCA

ATCTAAACGTGGATCGTTTCAGAGGGGTGGGTATCCTTACT

DOR46 

MAEVRVDSLEFFKSHWTAWRYLGVAHFRVENWKNLYVFYSIVSNLLVTLCYPVHLGIS
LFRNRTITEDILNLTTFATCTACSVKCLLYAYNIKDVLEMERLLRLLDERVVGPEQRS
IYGQVRVQLRNVLYVFIGIYMPCALFAELSFLFKEERGLMYPAWFPFDWLHSTRNYYI
ANAYQIVGISFQLLQNYVSDCFPAVVLCLISSHIKMLYNRFEEVGLDPARDAEKDLEA
CITDHKHILELFRRIEAFISLPMLIQFTVTALNVCIGLAALVFFVSEPMARMYFIFYS
LAMPLQIFPSCFFGTDNEYWFGRLHYAAFSCNWHTQNRSFKRKMMLFVEQSLKKSTAV
AGGMMRIHLDTFFSTLKGAYSLFTIIIRMRK


DOR46nt 

ATGGCAGAGGTCAGAGTGGACAGTCTGGAGTTTTTCAAGAGCCATTGGACCGCCTGGC 
GGTACTTGGGAGTGGCTCATTTTCGGGTCGAGAACTGGAAGAACCTTTACGTGTTTTA 
CAGCATTGTGTCGAATCTTCTCGTGACCCTGTGCTACCCCGTTCACCTGGGAATATCC 

CTCTTTCGCAACCGCACCATCACCGAGGACATCCTCAACCTGACCACCTTTGCGACCT

GCACAGCCTGTTCGGTGAAGTGCCTGCTCTACGCCTACAACATCAAGGATGTGCTGGA 
GATGGAGCGGCTGTTGAGGCTTTTGGATGAACGCGTCGTGGGTCCGGAGCAACGCAGC 
ATCTACGGACAAGTGAGGGTCCAGCTGCGAAATGTGCTATACGTGTTCATCGGCATCT 
ACATGCCGTGTGCCCTGTTCGCCGAGCTATCCTTTCTGTTCAAGGAGGAGCGCGGTCT
GATGTATCCCGCCTGGTTTCCCTTCGACTGGCTGCACTCCACCAGGAACTATTACATA 
GCGAACGCCTATCAGATAGTGGGCATCTCGTTTCAGCTGCTGCAAAACTATGTTAGCG 
ACTGCTTTCCGGCGGTGGTGCTGTGCCTGATCTCATCCCACATCAAAATGTTGTACAA 
CAGATTCGAGGAGGTGGGCCTGGATCCAGCCAGAGATGCGGAGAAGGACCTGGAGGCC
TGCATCACCGATCACAAGCATATTCTAGAGTGGGCAGGCGGCTCATTGGTTCGTGTTC 
TATTCACTTTCCAACTTTTTTCCAGACTATTCCGACGCATCGAGGCCTTCATTTCCCT 
GCCCATGCTAATTCAGTTCACAGTGACCGCCTTGAATGTGTGCATCGGTTTAGCAGCC 
CTGGTGTTTTTCGTCAGCGAGCCCATGGCACGGATGTACTTCATCTTCTACTCCCTGG 
CCATGCCGCTGCAGATCTTTCCGTCCTGCTTTTTCGGCACCGACAACGAGTACTGGTT
CGGACGCCTCCACTACGCGGCCTTCAGTTGCAATTGGCACACACAGAACAGGAGCTTT 
AAGCGGAAAATGATGCTGTTCGTTGAGCAATCGTTGAAGAAGAGCACCGCTGTGGCTG 
GCGGAATGATGCGTATCCACCTGGACACGTTCTTTTCCACCCTAAAGGGGGCCTACTC 
CCTCTTTACCATCATTATTCGGATGAGAAAG 


DOR48 

MERHYFMVPKFALSLIGFYPEQKRTVLVKLWSFFNFFILTYGCYAEAYYGIHYIPINI
ATALDALCPVASSILSLVKMVAIWWYQDELRSLIERRFYTLATQLTFLLLCCGFCTST
SYSVRHLIDNILRRTHGKDWIYETPFKMMFPDLLLRLPLYPITYILVHWHGYITVVCF
VGADGFFLGFCLYFTVLLLCLQDDVCDLLEVENIEKSPSEAEEARIVREMEKLVDRHN
EVAELTERLSGVMVEITLAHFVTSSLIIGTSVVDILLFSGLGIIVYVVYTCAVGVEIF
LYCLGGSHIMEACSNLARSTFSSHWYGHSVRVQKMTLLMVARAQRVLTIKIPFFSPSL
ETLTSILRFTGSLIALAKSVI

DOR48nt

ATGGAGCGCCATTATTTCATGGTGCCAAAGTTTGCATTATCGCTGATTGGTTTTTATC 
CCGAACAGAAGCGAACGGTTTTGGTGAAACTTTGGAGTTTCTTCAACTTTTTCATCCT 
CACCTACGGCTGTTATGCAGAGGCTTACTATGGCATACACTATATACCGATTAACATA 
GCCACTGCATTGGATGCCCTTTGTCCTGTGGCCTCCAGCATTTTGTCGCTGGTGAAAA 

TGGTCGCCATTTGGTGGTATCAAGATGAATTAAGGAGTTTGATAGAGCGGGTAAGATT

TTTAACAGAGCAACAGAAGTCCAAGAGGAAACTGGGCTATAAGAAGAGGTTCTATACA 
CTGGCAACGCAACTAACATTCCTGCTACTATGCTGTGGATTTTGCACCAGTACTTCCT 
ATTCCGTCAGACATTTGATTGATAATATCCTGAGACGCACCCATGGCAAGGACTGGAT 
CTACGAGACTCCGTTCAAGATGATGTAAGGAAAGGGAAGAATGGTTTATATATACTTT 
TGGAACGAAATAATGATGTGATCTAAACAAGATGCACTTTTTTTTAGGTTCCCCGATC
TTCTCCTGCGTTTGCCACTCTATCCCATCACCTATATACTCGTGCATTGGCATGGCTA 
CATTACTGTGGTTTGTTTTGTCGGCGCGGATGGTTTCTTCCTGGGGTTCTGTTTGTAC 
TTCACTGTTTTGCTGCTCTGTCTGCAGGACGATGTTTGTGATTTACTAGAGGTTGAAA 
ACATCGAGAAGAGTCCCTCCGAAGCGGAGGAAGCTCGCATAGTTCGGGAAATGGAAAA
ACTGGTGGACCGGCATAACGAGGTGGCCGAGCTGACAGAAAGATTGTCGGGTGTTATG 
GTGGAAATAACACTGGCCCACTTTGTTACTTCGAGTTTGATAATCGGAACCAGCGTGG 
TGGATATTTTATTAGTGGGTATTTACATTTGATTAGATCCTTTCGATATATGTTCTTA 
AATTCTAGTTTTCCGGCCTGGGAATCATTGTGTATGTGGTCTACACTTGTGCCGTAGG
TGTGGAAATATTTCTATACTGTTTAGGAGGATCTCATATTATGGAAGCGGTATATTCA 
TAAGAAACTACTATAAAGTTACTTTTAAATTCATTGCATTTCTTAGTGTTCCAATCTA 
GCGCGCTCCACATTTTCCAGCCACTGGTATGGCCACAGTGTTCGGGTCCAAAAGATGA 
CCCTTTTGATGGTAGCTCGTGCTCAACGAGTTCTCACAATTAAAATTCCTTTCTTTTC 
CCCATCATTAGAGACTCTAACTTCGGTAAGCTTATGCGAAAATGTTATGGTACACACA
AGTCTACATTTCTATGAGGTCTTGTAGATTTTGCGCTTCACTGGATCTCTGATTGCCC 
TGGCAAAGTCGGTTATA 

DOR53 

MLSKFFPHIKEKPLSERVKSRDAFIYLDRVMWSFGWTEPENKRWILPYKLWLAFVNIV
MLILLPISISIEYLHRFKTFSAGEFLSSLEIGVNMYGSSFKCAFTLIGFKKRQEAKVL
LDQLDKRCLSDKERSTVHRYVAMGNFFDILYHIFYSTFVVMNFPYFLLERRHAWRMYF
PYIDSDEQFYISSIAECFLMTEAIYMDLCTDVCPLISMLMARCHISLLKQRLRNLRSK
PGRTEDEYLEELTECIRDHRLLLDYVDALRPVFSGTIFVQFLLIGTVLGLSMINLMFF
STFWTGVATCLFMFDVSMETFPFCYLCNMIIDDCQEMSNCLFQSDWTSADRRYKSTLV
YFLHNLQQPITLTAGGVFPISMQTNLAMVKLAFSVVTVIKQFNLAERFQ

DOR53nt 

TCAAACAAAGCCACGGACAAGATGTTAAGCAAGTTTTTTCCCCACATAAAAGAAAAGC 

CATTGAGCGAGCGGGTTAAGTCCCGAGATGCCTTCATTTACTTGGATCGGGTGATGTG

GTCCTTTGGCTGGACAGAGCCTGAAAACAAAAGGTGGATCCTTCCTTATAAACTGTGG 
TTAGCGTTCGTGAACATAGTAATGCTCATCCTTCTGCCGATCTCGATAAGCATCGAGT 
ACCTCCACCGATTTAAAACCTTCTCGGCGGGGGAGTTCCTTAGTTCCCTCGAGATTGG 
AGTCAACATGTACGGAAGCTCTTTTAAGTGCGCCTTCACCTTGATTGGATTCAAGAAA 

AGACAGGAAGCTAAGGTTTTACTGGATCAGCTGGACAAGAGATGCCTTAGCGATAAGG

AGAGGTCCACTGTTCATCGCTATGTCGCCATGGGAAACTTTTTCGATATTTTGTATCA 
CATTTTTTACTCCACCTTCGTGGTAATGAACTTCCCGTATTTTCTGCTTGAGAGACGC 
CATGCTTGGCGCATGTACTTTCCATATATCGATTCCGACGAACAGTTTTACATCTCCA 
GCATCGCCGAGTGTTTTCTGATGACGGAGGCCATCTACATGGATCTCTGTACGGACGT 
GTGTCCCTTGATCTCCATGCTTATGGCTCGATGCCACATCAGCCTCCTGAAACAGCGA
CTGAGAAATCTCCGATCGAAGCCAGGAAGGACCGAAGATGAGTACTTGGAGGAGCTCA 
CCGAGTGCATTCGGGATCATCGATTGCTATTGGACTATGTTGACGCATTGCGACCCGT 
CTTTTCGGGAACCATTTTTGTGCAGTTCCTCCTGATCGGTACTGTACTGGGTCTCTCA 
ATGATAAATCTAATGTTCTTCTCGACATTTTGGACTGGTGTCGCCACTTGCCTTTTTA
TGTTCGACGTGTCCATGGAGACGTTCCCCTTTTGCTATTTGTGCAACATGATTATCGA 
TGACTGCCAGGAAATGTCCAATTGCCTCTTTCAATCGGACTGGACCTCTGCCGATCGT
CGCTACAAATCCACTTTGGTATACTTTCTTCACAATCTTCAGCAACCCATTACTCTCA 
CGGCTGGTGGAGTGTTTCCTATTTCCATGCAAACAAATTTGGCTATGGTGAAGCTGGC
ATTTTCTGTGGTTACGGTAATTAAGCAATTTAACTTGGCCGAAAGGTTTCAATAAGTT 
GAGAGGGACGAGCTCTGCTACTATTATATTATATATTATATTATATTATATATATATT 
ATTTTATATTATATATTGCTGTACCCTAATAAATATTTAGTAATAAAAAAAAAAAAAA 
AAAA 


DOR56

MDPVEMPIFGSTLKLMKFWSYLFVHNWRRYVAMTPYIIINCTQYVDIYLSTESLDFII
RNVYLAVLFTNTVVRGVLLCVQRFSYERFINILKSFYIELLVSTERLSQKCILHKWAV
LPYGMYLPTIDEYKYASPYYEIFFVIQAIMAPMGCCMYIPYTNMVVTFTLFAILMCRV
LQHKLRSLEKLKNEQVRGEIAQTIAQTVIVIAYMVMIFANSVVLYYVANELYFQSFDI
AIAAYESNWMDFDVDTQKTLKFLIMRSQKPLASLVGGTYPMNLKMLQSLLNAIYSFFT
LLRRVYG

DOR56nt

ATGGATCCGGTGGAGATGCCCATTTTTGGTAGCACTCTGAAGCTAATGAAGTTCTGGT
CATATCTGTTTGTTCACAACTGGCGCCGCTATGTCGCAATGACTCCGTACATCATTAT 
CAACTGTACTCAGTATGTGGATATATATCTGAGCACCGAATCCTTGGACTTTATCATC 
AGAAATGTATACCTGGCTGTATTGTTTACCAACACGGTGGTCAGAGGTGTATTGTTAT 
GCGTACAGCGGTTTAGCTACGAGCGTTTCATTAATATTTTGAAAAGCTTTTACATTGA 

GTTGTTGGTGAGTACCGAAAGATTATCTCAAAAATGCATATTGCATAAATGGGCAGTT

CTGCCATATGGCATGTATTTGCCCACTATTGATGAATACAAATACGCATCACCTTACT 
ACGAGATTTTCTTTGTGATTCAAGCCATTATGGCTCCAATGGGGTGTTGCATGTACAT 
ACCATACACAAACATGGTAGTGACATTTACCCTTTTCGCCATTCTCATGTGTCGAGTG 
TTGCAACATAAGTTGAGAAGCCTAGAAAAGCTGAAAAATGAACAAGTACGTGGTGAAA 

TCGCTCAAACAATTGCTCAGACCGTCATAGTCATCGCATACATGGTAATGATATTTGC

CAACAGTGTAGTCCTTTACTACGTGGCCAATGAGCTATACTTTCAAAGCTTTGATATT 
GCCATTGCTGCCTATGAGAGCAATTGGATGGACTTTGATGTGGACACACAAAAGACTT 
TGAAGTTCCTCATCATGCGCTCGCAAAAGCCCTTGGCGAGTCTGGTGGGTGGCACATA 
TCCCATGAACTTGAAAATGCTTCAGTCACTACTAAATGCCATTTACTCCTTCTTCACC 
CTTCTGCGTCGCGTTTACGGC

DOR58

MDASYFAVQRRALEIVGFDPSTPQLSLKHPIWAGILILSLISHNWPMVVYALQDLSDL
TRLTDNFAVFMQGSQSTFKFLVMMAKRRRIGSLIHRLHKLNQAASATPNHLEKIEREN
QLDRYVARSFRNAAYGVICASAIAPMLLGLWGYVETGVFTPTTPMEFNFWLDERKPHF
YWPIYVWGVLGVAAAAWLAIATDTLFSWLTHNVVIQFQLLELVLEEKDLNGGDSRLTG
FVSRHRIALDLAKELSSIFGEIVFVKYMLSYLQLCMLAFRFSRSGWSAQVPFRATFLV
AIIIQLSSYCYGGEYIKQQSLAIAQAVYGQINWPEMTPKKRRLWQMVIMRAQRPAKIF
GFMFVVDLPLLLWVIRTAGSFLAMLRTFER

DOR58nt

ATGGACGCCAGCTACTTTGCCGTCCAGAGAAGAGCTCTGGAAATAGTTGGATTCGATC 
CCAGTACTCCGCAACTGAGTCTGAAACATCCCATCTGGGCCGGGATTCTCATCCTGTC 
CTTGATCTCTCACAACTGGCCCATGGTAGTCTATGCCCTGCAGGATCTCTCCGACTTG 
ACCCGTCTGACGGACAACTTTGCGGTGTTTATGCAAGGATCACAGAGCACCTTCAAGT

TCCTGGTCATGATGGCGAAACGAAGGCGCATTGGATCGTTGATTCACCGTTTGCATAA
GCTAAACCAGGCGGCCAGTGCCACGCCCAATCACCTGGAGAAGATCGAGAGGGAAAAC 
CAACTGGATAGGTATGTCGCCAGGTCCTTTAGAAATGCCGCCTACGGAGTGATTTGTG 
CCTCGGCCATAGCGCCCATGTTGCTTGGCCTGTGGGGATATGTGGAGACGGGTGTATT 
TACCCCCACCACACCCATGGAGTTCAACTTCTGGCTGGACGAGCGAAAGCCTCACTTT 

TATTGGCCCATCTACGTTTGGGGCGTACTGGGCGTGGCAGCTGCCGCCTGGTTGGCCA
TTGCAACGGACACCCTGTTCTCCTGGCTGACTCACAATGTGGTGATTCAGTTCCAACT 
ACTGGAGCTTGTTCTCGAAGAGAAGGATCTGAATGGCGGAGACTCTCGCCTGACCGGG 
TTTGTTAGTCGTCATCGTATAGCTCTGGATTTGGCCAAGGAACTAAGTTCGATTTTCG 
GGGAGATCGTCTTTGTGAAATACATGCTCAGTTACCTGCAACTCTGCATGTTGGCCTT 

TCGCTTCAGCCGCAGTGGCTGGAGTGCCCAGGTGCCATTTAGAGCCACCTTCCTAGTG

GCCATCATCATCCAACTGAGTTCGTATTGCTATGGAGGCGAGTATATAAAGCAGCAAA 
GTTTGGCCATCGCACAAGCCGTTTATGGTCAAATCAATTGGCCAGAAATGACGCCAAA 
GAAAAGAAGACTCTGGCAAATGGTGATCATGAGGGCGCAGCGACCGGCTAAGATTTTT 
GGATTCATGTTCGTTGTGGACTTGCCACTGCTGCTTTGGGTCATCAGAACTGCGGGCT 

CATTTCTGGCCATGCTTAGGACTTTCGAGCGT

DOR59 

MHEADNREMELLVATQAYTRTITLLIWIPSVIAGLMAYSDCIYRSLFLPKSVFNVPAV
RRGEEHPILLFQLFPFGELCDNFVVGYLGPWYALGLGITAIPLWHTFITCLMKYVNLK
LQILNKRVEEMDITRLNSKLVIGRLTASELTFWQMQLFKEFVKEQLRIRKFVQELQYL
ICVPVMADFIIFSVLICFLFFALTVGHDELSLAYFSCGWYNFEMPLQKMLVFMMMHAQ
RPMKMRALLVDLNLRTFIDIGRGAYSYFNLLRSSHLY

DOR59nt

ATGCACGAAGCAGATAATCGGGAGATGGAACTTTTGGTCGCCACTCAGGCTTATACAC 
GAACCATTACCCTGTTGATCTGGATACCATCGGTTATTGCTGGCCTAATGGCCTATTC 
AGACTGCATCTACAGGAGTCTGTTTCTGCCGAAATCGGTTTTCAATGTGCCAGCTGTG 
CGACGTGGTGAGGAGCATCCCATTCTGCTATTTCAGCTGTTTCCCTTCGGAGAACTTT
GCGATAACTTCGTTGTTGGATACTTGGGACCTTGGTATGCTCTGGGCCTGGGAATCAC 
GGCTATCCCATTGTGGCACACCTTTATCACTTGCCTCATGAAGTACGTAAATCTCAAG 
CTGCAAATACTCAACAAGCGAGTGGAGGAGATGGATATTACCCGACTTAATTCCAAAT 
TGGTAATTGGTCGCCTAACTGCCAGTGAGTTAACCTTCTGGCAAATGCAACTCTTCAA 

GGAATTTGTAAAGGAACAGCTGAGGATTCGAAAATTTGTCCAGGAACTACAGTATCTG
ATTTGCGTGCCTGTGATGGCAGATTTCATTATCTTCTCGGTTCTCATTTGCTTTCTCT 
TTTTTGCCTTGACAGTTGGCCACGATGAACTGAGCCTTGCTTACTTTTCTTGCGGATG 
GTACAACTTCGAAATGCCTTTGCAGAAAATGCTGGTTTTTATGATGATGCATGCCCAA 
AGGCCGATGAAGATGCGCGCCCTGCTGGTCGATTTGAATCTGAGGACCTTCATAGACA 

TTGGCCGTGGAGCCTACAGCTACTTCAATTTGCTGCGTAGCTCCCACTTGTAT



DOR61

MGHKDDMDSTDSTALSLKHIS SLI FVI SAQYPLI S YVAYNRNDMEKVTACLS WFTNM
LT VI KI S TFLiANRKD F WEM I HRFRKMHEQCKYREGLDYVAEANKLAS FLGRAYCVSCG 

LTGLY FMLGP I VKI GVCRWHGTTCDKEL PMPMKF PFNDLES PGYEVCFLYTVLVTVVV

VAYASAVDGL F I SFAINLRAHFQTLQRQ I ENWEFPS SEPDTQ I RLKS I VEYHVLLLSL 
SRKiRSIYTPTVMGQFVITSLQVGVIIYQLVTNl^SVMDLLLYASFFGSIMLQLFIYC 
YGGEI IKAESLQVDTAVRLSNWHLASPKTRTSLSLI ILQSQKEVLIRAGFFVASLANF 
PYRLITLIKSIDSIC 


DOR61nt

**information on nucleotide sequence is in progress**

DOR62

MEKQEDFKLNTHSAVYYHWRVWELTGLMRPPGVSSLLYVVYSITVNLVVTVLFPLSLL
ARLLFTTNMAGLCENLTITITDIVANLKFANVYMVRKQLHEIRSLLRLMDARARLVGD
PEEISALRKEVNIAQGTFRTFASIFVFGTTLSCVRVVVRPDRELLYPAWFGVDWMHST
RNYVLINIYQLFGLIVQAIQNCASDSYPPAFLCLLTGHMRALELRVRRIGCRTEKSNK
GQTYEAWREEVYQELIECIRDLARVHRLREIIQRVLSVPCMAQFVCSAAVQCTVAMHF
LYVADDHDHTAMIISIVFFSAVTLEVFVICYFGDRMRTQSEALCDAFYDCNWIEQLPK
FKRELLFTLARTQRPSLIYAGNYIALSLETFEQVMRFTYSVFTLLLRAK

DOR62nt

ATGGAGAAGCAAGAGGATTTCAAACTGAACACCCACAGTGCTGTGTACTACCACTGGC 
GCGTTTGGGAGCTCACTGGCCTGATGCGTCCTCCGGGCGTTTCAAGCCTGCTTTACGT 
GGTATACTCCATTACGGTCAACTTGGTGGTCACCGTGCTGTTTCCCTTGAGCTTGCTG 
GCCAGGCTGCTGTTCACCACCAACATGGCCGGATTGTGCGAGAACCTGACCATAACTA
TTACCGATATTGTGGCCAATTTGAAGTTTGCGAATGTGTACATGGTGAGGAAGCAGCT 
CCATGAGATTCGCTCTCTCCTAAGGCTCATGGACGCTAGAGCCCGGCTGGTGGGCGAT 
CCCGAGGAGATTTCTGCCTTGAGGAAGGAAGTGAATATCGCACAGGGCACTTTCCGCA 
CCTTTGCCAGTATTTTCGTATTTGGCACTACTTTGAGTTGCGTCCGCGTGGTCGTTCG 

CCCGGATCGAGAGCTCCTGTATCCGGCCTGGTTCGGCGTTGACTGGATGCACTCCACC
AGAAACTATGTGCTCATCAATATCTACCAGCTCTTCGGCTTGATAGTGCAGGCTATAC 
AGAACTGCGCTAGTGACTCCTATCCGCCTGCGTTTCTCTGCCTGCTCACGGGTCATAT 
GCGTGCTTTGGAGCTGAGGGTGCGGCGGATTGGCTGCAGGACGGAAAAGTCCAATAAA 
GGGCAGACATATGAAGCCTGGCGGGAGGAGGTGTACCAGGAACTCATCGAGTGCATCC 

GCGATCTGGCGCGGGTCCATCGGCTGAGGGAGATCATTCAGCGGGTCCTTTCAGTGCC
CTGCATGGCCCAGTTCGTCTGCTCCGCCGCCGTCCAGTGTACCGTCGCCATGCACTTC 
CTGTACGTAGCGGATGACCACGACCACACCGCCATGATCATCTCGATTGTATTTTTCT 
CGGCCGTCACCTTGGAGGTGTTTGTAATCTGCTATTTTGGGGACAGGATGCGGACACA 
GAGCGAGGCGCTGTGCGATGCCTTCTACGATTGCAACTGGATAGAACAGCTGCCCAAG 

TTCAAGCGCGAACTGCTCTTCACCCTGGCCAGGACGCAGCGGCCTTCTCTTATTTACG

CAGGCAACTACATCGCACTCTCGCTGGAGACCTTCGAGCAGGTCATGAGGTTCACATA 
CTCTGTTTTCACACTCTTGCTGAGGGCCAAGTAAGAACTTTATAATCTCTTTTTGGGG 
AGAAAAATTTTAAAGCACAATAGCAGAAAAATATATCAGATAATATAACAAAAAAAAA 
AAAAAAAAA 


DOR64

MKLSETLKIDYFRVQLNAWRICGALDLSEGRYWSWSMLLCILVYLPTPMLLRGVYSFE
DPVENNFSLSLTVTSLSNLMKFCMYVAQLTKMVEVQSLIGQLDARVSGESQSERHRNM
TEHLLRMSKLFQITYAVVFIIAAVPFVFETELSLPMPMWFPFDWKNSMVAYIGALVFQ
EIGYVFQIMQCFAADSFPPLVLYLISEQCQLLILRISEIGYGYKTLEENEQDLVNCIR
DQNALYRLLDVTKSLVSYPMMVQFMVIGINIAITLFVLIFYVETLYDRIYYLCFLLGI
TVQTYPLCYYGTMVQESFAELHYAVFCSNWVDQSASYRGHMLILAERTKRMQLLLAGN
LVPIHLSTYVACWKGAYSFFTLMADRDGLGS

DOR64nt

GGCACGAGCCAAGAATTCAAAATGAAACTCAGCGAAACCCTAAAAATCGACTATTTTC 
GAGTCCAGTTGAATGCCTGGCGAATTTGTGGTGCCTTGGATCTCAGCGAGGGTAGGTA 
CTGGAGTTGGTCGATGCTATTGTGCATCTTGGTGTACCTGCCGACACCCATGCTACTG
AGAGGAGTATACAGTTTCGAGGATCCGGTGGAAAATAATTTCAGCTTGAGCCTGACGG 
TCACATCGCTGTCCAATCTCATGAAGTTCTGCATGTACGTGGCCCAACTAACAAAGAT 
GGTCGAGGTCCAGAGTCTTATTGGTCAGCTGGATGCCCGGGTTTCTGGCGAGAGCCAG 
TCTGAGCGTCATAGAAATATGACCGAGCACCTGCTAAGGATGTCCAAGCTGTTCCAGA
TCACCTACGCTGTAGTCTTCATCATTGCTGCAGTTCCCTTCGTTTTCGAAACTGAGCT 
AAGCTTACCCATGCCCATGTGGTTTCCCTTCGACTGGAAGAACTCGATGGTGGCCTAC 
ATCGGAGCTCTGGTTTTCCAGGAGATTGGCTATGTCTTTCAAATTATGCAATGCTTTG 
CAGCTGACTCGTTTCCCCCGCTCGTACTGTACCTGATCTCCGAGCAATGTCAATTGCT 

GATCCTGAGAATCTCTGAAATCGGATATGGTTACAAGACTCTGGAGGAGAACGAACAG
GATCTGGTCAACTGCATCAGGGATCAAAACGCGCTGTATAGATTACTCGATGTGACCA 
AGAGTCTCGTTTCGTATCCCATGATGGTGCAGTTTATGGTTATTGGCATCAACATCGC 
CATCACCCTATTTGTCCTGATATTTTACGTGGAGACCTTGTACGATCGCATCTATTAT 
CTTTGCTTTCTCTTGGGCATCACCGTGCAGACATATCCATTGTGCTACTATGGAACCA 

TGGTGCAGGAGAGTTTTGCTGAGCTTCACTATGCGGTATTCTGCAGCAACTGGGTGGA
TCAAAGTGCCAGCTATCGTGGGCACATGCTCATCCTGGCGGAGCGCACTAAGCGGATG 
CAGCTTCTCCTCGCCGGCAACCTGGTGCCCATCCACCTGAGCACCTACGTGGCCTGTT 
GGAAGGGAGCCTACTCCTTCTTCACCCTGATGGCCGATCGAGATGGCCTGGGTTCTTA 
GTAGCCCAGTCATTTCACTCACATTCTACATCAAGTAGTACTACCACTGAACACGAAC 

ACGAATATTTCAAAAGTAAACACATAATATTCACAATAGTGTATCACTTTAATAAAAT

TTTTGGTTACCATGAAAAAAAAAAAAAAAAAA 

DOR67 

MLSQFFPHIKEKPLSERVKSRDAFVYLDRVMWSFGWTVPENKRWDLHYKLWSTFVTLV
IFILLPISVSVEYIQRFKTFSAGEFLSSIQIGVNMYGSSFKSYLTMMGYKKRQEAKMS
LDELDKRCVCDEERTIVHRHVALGNFCYIFYHIAYTSFLISNFLSFIMKRIHAWRMYF
PYVDPEKQFYISSIAEVILRGWAVFMDLCTDVCPLISMVIARCHITLLKQRLRNLRSE
PGRTEDEYLKELADCVRDHRLILDYVDALRSVFSGTIFVQFLLIGIVLGLSMINIMFF
STLSTGVAVVLFMSCVSMQTFPFCYLCNMIMDDCQEMADSLFQSDWTSADRRYKSTLV
YFLHNLQQPIILTAGGVFPISMQTNLNKVKLAFTVVTIVKQF

DOR67nt 

GGCACGAGGAAATGTTAAGCCAGTTCTTTCCCCACATTAAAGAAAAGCCATTGAGCGA 
GCGGGTTAAGTCCCGAGATGCCTTCGTTTACTTAGATCGGGTGATGTGGTCCTTTGGC 
TGGACAGTGCCTGAAAACAAAAGGTGGGATCTACATTACAAACTGTGGTCAACTTTCG
TGACATTGGTGATATTTATCCTTCTGCCGATATCGGTAAGCGTTGAGTATATTCAGCG 
GTTCAAGACCTTCTCGGCGGGTGAGTTTCTTAGCTCAATCCAGATTGGCGTTAACATG 
TACGGAAGCAGCTTTAAAAGTTATTTGACCATGATGGGATATAAGAAGAGACAGGAGG 
CTAAGATGTCACTGGATGAGCTGGACAAGAGATGCGTTTGTGATGAGGAGAGGACCAT
TGTACATCGACATGTCGCCCTGGGAAACTTTTGCTATATTTTCTATCACATTGCGTAC 
ACTAGCTTTTTGATTTCAAACTTTTTGTCATTTATAATGAAGAGAATCCATGCCTGGC 
GCATGTACTTTCCCTACGTCGACCCCGAAAAGCAATTTTACATCTCTAGCATCGCCGA 
AGTCATTCTTAGGGGGTGGGCCGTCTTCATGGATCTCTGCACGGATGTGTGTCCTTTG
ATCTCCATGGTAATAGCACGATGCCACATCACCCTTCTGAAACAGCGCCTGCGAAATC 
TACGATCGGAACCAGGAAGGACGGAAGATGAGTACTTGAAGGAGCTCGCCGACTGCGT 
TCGAGATCACCGCTTGATATTGGACTATGTCGACGCATTGCGATCCGTCTTTTCGGGG 
ACAATTTTTGTGCAGTTCCTCTTGATCGGTATTGTACTGGGTCTGTCAATGATAAATA 

TAATGTTTTTCTCAACACTTTCGACTGGTGTCGCCGTTGTCCTTTTTATGTCCTGCGT
ATCTATGCAGACGTTCCCCTTTTGCTATTTGTGTAACATGATTATGGATGACTGCCAA 
GAGATGGCCGACTCCCTTTTTCAATCGGACTGGACATCTGCCGATCGTCGCTACAAAT
CCACTTTGGTATACTTTCTTCACAATCTTCAGCAGCCCATTATTCTTACGGCTGGTGG 
AGTCTTTCCTATTTCCATGCAAACAAATTTAAATATGGTGAAGCTGGCCTTTACTGTG 

GTTACAATAGTGAAACAATTTAACTTGGCAGAAAAGTTTCAATAAGTTAAGATATGCA
AGCTCTGCTATTATAAACCTACACTCGAGAAAATATTTCTTCACATTAATAAACCTTC 
AGTACTTACTGCTTGTGGCGCCCCCGGAAAAAAAAAAAAAAAAAA 

DOR68 

MSKLIEVFLGNLWTQRFTFARMGLDLQPDKKGNVLRSPLLYCIMCLTTSFELCTVCAF
MVQNRNQIVLCSEALMHGLQMVSSLLKMAIFLAKSHDLVDLIQQIQSPFTEEDLVGTE
WRSQNQRGQLMAAIYFMMCAGTSVSFLLMPVALTMLKYHSTGEFAPVSSFRVLLPYDV
TQPHVYAMDCCLMVFVLSFFCCSTTGVDTLYGWCALGVSLQYRRLGQQLKRIPSCFNP
SRSDFGLSGIFVEHARLLKIVQHFNYSFMEIAFVEVVIICGLYCSVICQYIMPHTNQN
FAFLGFFSLVVTTQLCIYLFGAEQVRLEAERFSRLLYEVIPWQNLPPKHRKLFLFPIE
RAQRETVLGAYFFELGRPLLVWVSIFLFIVLLF

DOR68nt

ATGTCAAAGCTAATCGAGGTGTTTCTGGGTAATCTGTGGACCCAGCGTTTTACCTTCG 

CCCGAATGGGTTTGGATTTGCAGCCCGATAAAAAGGGCAATGTTTTGCGATCTCCGCT

TCTTTATTGTATTATGTGTCTGACAACAAGCTTTGAGCTCTGCACCGTGTGCGCCTTT 
ATGGTCCAAAATCGCAACCAAATCGTGCTTTGTTCCGAGGCCCTGATGCACGGACTAC 
AGATGGTCTCCTCGCTACTGAAGATGGCTATATTCTTGGCCAAATCTCACGACCTGGT 
GGACCTAATTCAACAGATTCAGTCGCCTTTTACAGAGGAGGATCTTGTAGGTACAGAG 
TGGAGATCCCAAAATCAAAGGGGACAACTAATGGCTGCCATTTACTTTATGATGTGTG
CCGGTACGAGTGTGTCATTTCTGTTGATGCCAGTGGCTTTGACCATGCTTAAGTACCA 
TTCCACTGGGGAATTCGCGCCTGTCAGCTCGTTCCGGGTTCTGCTTCCATACGATGTG 
ACACAACCGCATGTTTATGCCATGGACTGCTGCTTGATGGTATTTGTGTTAAGTTTTT 
TTTGCTGCTCCACCACCGGAGTGGATACCTTATATGGATGGTGTGCTTTAGGCGTGAG
TTTACAATACCGTCGCCTCGGTCAACAACTTAAAAGGATACCCTCCTGTTTCAATCCA 
TCTCGGTCTGACTTTGGATTAAGTGGGATTTTTGTGGAGCATGCTCGTCTGCTTAAAA 
TAGTCCAACATTTTAATTATAGTTTTATGGAGATCGCATTTGTGGAGGTTGTTATAAT 
CTGTGGACTCTATTGCTCAGTAATTTGTCAGTATATAATGCCACACACCAACCAAAAC
TTCGCCTTTCTGGGTTTCTTTTCATTGGTAGTTACCACACAGCTGTGCATCTATCTTT 
TCGGTGCCGAACAGGTCCGTTTGGAGGCTGAGCGATTTTCCCGGCTGCTATACGAAGT 
AATTCCTTGGCAAAACCTTCCTCCTAAACACCGGAAACTTTTCCTTTTTCCAATTGAG 
CGCGCCCAACGAGAAACTGTTCTCGGTGCTTATTTCTTCGAACTAGGCAGACCTCTTC 
TTGTTTGGGTAAGCATATTCCTTTTTATTGTATTATTATTT

DOR71 

MVIIDSLSFYRPFWICMRLLVPTFFKDSSRPVQLYVVLLHILVTLWFPLHLLLHLLLL
PSTAEFFKNLTMSLTCVACSLKHVAHLYHLPQIVEIESLIEQLDTFIASEQEHRYYRD
HVHCHARRFTRCLYISFGMIYALFLFGVFVQVISGNWELLYPAYFPFDLESNRFLGAV
ALGYQVFSMLVEGFQGLGNDTYTPLTLCLLAGHVHLWSIRMGQLGYFDDETVVNHQRL
LDYIEQHKLLVRFHNLVSRTISEVQLVQLGGCGATLCIIVSYMLFFVGDTISLVYYLV
FFGVVCVQLFPSCYFASEVAEELERLPYAIFSSRWYDQSRDHRFDLLIFTQLTLGNRG
WIIKAGGLIELNLNAFFATLKMAYSLFAWHRETGNPLQREH 


DOR71nt 

ATGGTCATTATCGACAGTCTTAGTTTTTATCGTCCATTCTGGATCTGCATGCGATTGC 
TGGTACCGACTTTCTTCAAGGATTCCTCACGTCCTGTCCAGCTGTACGTGGTGTTGCT 
GCACATCCTGGTCACCTTGTGGTTTCCACTGCATCTGCTGCTGCATCTTCTGCTACTT 

CCATCTACCGCTGAGTTCTTTAAGAACCTGACCATGTCTCTGACTTGTGTGGCCTGCA

GTCTGAAGCATGTGGCCCACTTGTATCACTTGCCGCAGATTGTGGAAATCGAATCACT 
GATCGAGCAATTAGACACATTTATTGCCAGCGAACAGGAGCATCGTTACTATCGGGAT
CACGTACATTGCCATGCTAGGCGCTTTACAAGATGTCTCTATATTAGCTTTGGCATGA 
TCTATGCGCTTTTCCTGTTCGGCGTCTTCGTTCAGGTTATTAGCGGAAATTGGGAACT 

TCTCTATCCAGCCTATTTCCCATTCGACTTGGAGAGCAATCGCTTTCTCGGCGCAGTA

GCCTTGGGCTATCAGGTATTCAGCATGTTAGTTGAAGGCTTCCAGGGGCTGGGCAACG 
ATACCTATACCCCACTGACCCTATGCCTTCTGGCCGGACATGTCCATTTGTGGTCCAT
ACGAATGGGTCAACTGGGATACTTCGATGACGAGACGGTGGTGAATCATCAGCGTTTG 
CTGGATTACATTGAGCAGCATAAACTCTTGGTGCGGTTCCACAACCTGGTGAGCCGGA 
CCATCAGCGAAGTGCAACTGGTGCAGCTGGGCGGATGTGGAGCCACTCTGTGCATCAT
TGTCTCCTACATGCTCTTCTTTGTGGGCGACACAATCTCGCTGGTCTACTACTTGGTG 
TTCTTTGGAGTGGTCTGCGTGCAGCTCTTTCCCAGCTGCTATTTTGCCAGCGAAGTAG 
CCGAGGAGTTGGAACGGCTGCCATATGCGATCTTCTCCAGCAGATGGTACGATCAATC 
GCGGGATCATCGATTCGATTTGCTCATCTTTACACAATTAACACTGGGAAACCGGGGG
TGGATCATCAAGGCAGGAGGTCTTATCGAGCTGAATTTGAATGCCTTTTTCGCCACCC 
TGAAGATGGCCTATTCCCTTTTTGCAGTTGTGGTGCGGGCAAAGGGTATA 

DOR72

MDLKPRVIRSEDIYRTYWLYWHLLGLESNFFLNRLLDLVITIFVTIWYPIHLILGLFM
ERSLGDVCKGLPITAACFFASFKFICFRFKLSEIKEIEILFKELDQRALSREECEFFN
QNTRREANFIWKSFIVAYGLSNISAIASVLFGGGHKLLYPAWFPYDVQATELIFWLSV
TYQIAGVSLAILQNLANDSYPPMTFCVVAGHVRLLAMRLSRIGQGPEETIYLTGKQLI
ESIEDHRKLMKIVELLRSTMNISQLGQFISSGVNISITLVNILFFADNNFAITYYGVY
FLSMVLELFPCCYYGTLISVEMNQLTYAIYSSNWMSMNRSYSRILLIFMQLTLAEVQI
KAGGMIGIGMNAFFATVRLAYSFFTLAMSLR

DOR72nt 

ATGGACTTAAAACCGCGAGTCATTCGAAGTGAAGATATCTACAGAACCTATTGGTTAT
ATTGGCATCTTTTGGGCCTGGAAAGCAATTTCTTTCTGAATCGCTTGTTGGATTTGGT 
GATTACAATTTTCGTAACCATTTGGTATCCAATTCACCTGATTCTGGGACTGTTTATG 
GAAAGATCTTTGGGGGATGTCTGCAAGGGTCTACCAATTACGGCAGCATGCTTTTTCG 
CCAGCTTTAAATTTATTTGTTTTCGCTTCAAGCTATCTGAAATTAAAGAAATCGAAAT 

ATT^TTTAAAGAGCTGGATCAGCGAGCTTTAAGTCGAGAGGAATGCGAGTTTTTCAAT
CAAAATACGAGACGTGAGGCGAATTTCATTTGGAAAAGTTTCATTGTGGCCTATGGAC
TGTCGAATATCTCGGCTATTGCATCAGTTCTTTTCGGCGGTGGACATAAGCTATTATA 
TCCCGCCTGGTTTCCATACGATGTGCAGGCCACGGAACTAATATTTTGGCTAAGTGTA 
ACATACCAAATTGCCGGAGTAAGTTTGGCCATACTTCAGAATTTGGCCAATGATTCCT 

ATCCACCGATGACATTTTGCGTGGTTGCCGGTCATGTAAGACTTTTGGCGATGCGCTT

GAGTAGAATTGGCCAAGGTCCAGAGGAAACAATATACTTAACCGGAAAGCAATTAATC 
GAAAGCATCGAGGATCACCGAAAACTAATGAAGATAGTGGAATTACTGCGCAGCACCA 
TGAATATTTCGCAGCTCGGCCAGTTTATTTCAAGTGGTGTTAATATTTCCATAACACT 
AGTCAACATTCTCTTCTTTGCGGATAATAATTTCGCTATAACCTACTACGGAGTGTAC 

TTCCTATCGATGGTGTTGGAATTATTCCCGTGCTGCTATTACGGCACCCTGATATCCG

TGGAGATGAACCAGCTGACCTATGCGATTTACTCAAGTAACTGGATGAGTATGAATCG 
GAGCTACAGCCGCATCCTACTGATCTTCATGCAACTCACCCTGGCGGAAGTGCAGATC 
AAGGCCGGTGGGATGATTGGCATCGGAATGAACGCCTTCTTTGCCACCGTGCGATTGG 
CCTACTCCTTCTTCACTTTGGCCATGTCGCTGCGT 

DOR73

MDSRRKVRSENLYKTYWLYWRLLGVEGDYPFRRLVDFTITSFITILFPVHLILGMYKK
PQIQVFRSLHFTSECLFCSYKFFCFRWKLKEIKTIEGLLQDLDSRVESEEERNYFNQN
PSRVARMLSKSYLVAAISAIITATVAGLFSTGRNLMYLGWFPYDFQATAAIYWISFSY
QAIGSSLLILENLANDSYPPITFCVVSGHVRLLIMRLSRIGHDVKLSSSENTRKLIEG
IQDHRKLMKIIRLLRSTLHLSQLGQFLSSGINISITLINILFFAENNFAMLYYAVFFA
AMLIELFPSCYYGILMTMEFDKLPYAIFSSNWLKMDKRYNRSLIILMQLTLVPVNIKA
GGIVGIDMSAFFATVRMAYSFYTLALSFRV

DOR73nt

ATGGATTCAAGAAGGAAAGTCCGAAGTGAAAATCTTTACAAAACCTATTGGCTTTACT 
GGCGACTTCTGGGAGTCGAGGGCGATTATCCTTTTCGACGGCTAGTGGATTTTACAAT 
CACGTCTTTCATTACGATTTTATTTCCCGTGCATCTTATACTGGGAATGTATAAAAAG 
CCCCAGATTCAAGTCTTCAGGAGTCTGCATTTCACATCGGAATGCCTTTTCTGCAGCT 

ATAAGTTTTTCTGTTTTCGTTGGAAACTTAAAGAAATAAAGACCATCGAAGGATTGCT
CCAGGATCTCGATAGTCGAGTTGAAAGTGAAGAAGAACGCAACTACTTTAATCAAAAT 
CCAAGTCGTGTGGCTCGAATGCTTTCGAAAAGTTACTTGGTAGCTGCTATATCGGCCA 
TAATCACTGCAACTGTAGCTGGTTTATTTAGTACTGGTCGAAATTTAATGTATCTGGG 
TTGGTTTCCCTACGATTTTCAAGCAACCGCCGCAATCTATTGGATTAGTTTTTCCTAT 

CAGGCGATTGGCTCTAGTCTGTTGATTCTGGAAAATCTGGCCAACGATTCATATCCGC
CGATTACATTTTGTGTGGTCTCTGGACATGTGAGACTATTGATAATGCGTTTAAGTCG 
AATTGGTCACGATGTAAAATTATCAAGTTCGGAAAATACCAGAAAACTCATCGAAGGT 
ATCCAGGATCACAGGAAACTAATGAAGATAATACGCCTACTTCGCAGCACTTTACATC 
TTAGCCAACTGGGCCAGTTCCTTTCTAGTGGAATCAACATTTCCATAACACTCATCAA 

CATCCTGTTCTTTGCGGAAAACAACTTTGCAATGCTTTATTATGCGGTGTTCTTTGCT

GCAATGTTAATAGAACTATTTCCAAGTTGTTACTATGGAATTCTGATGACAATGGAGT 
TTGATAAGCTACCATATGCCATCTTCTCCAGCAACTGGCTTAAAATGGATAAAAGATA 
CAATCGATCCTTGATAATTCTGATGCAACTAACACTGGTTCCAGTGAATATAAAAGCA 
GGTGGTATTGTTGGCATCGATATGAGTGCATTTTTTGCCACAGTTCGGATGGCATATT 

CCTTTTACACTTTAGCCTTGTCATTTCGAGTA

DOR77 

MELMRVPVQFYRTIGEDIYAHRSTNPLKSLLFKIYLYAGFINFNLLVIGELVFFYNSI
QDFETIRLAIAVAPCIGFSLVADFKQAAMIRGKKTLIMLLDDLENMHPKTLAKQMEYK
LPDFEKTMKRVINIFTFLCLAYTTTFSFYPAIKASVKFNFLGYDTFDRNFGFLIWFPF
DATRNNLIYWIMYWDIAHGAYLAAFQVTESTVEVIIIYCIFLMTSMVQVFMVCYYGDT
LIAASLKVGDAAYNQKWFQCSKSYCTMLKLLIMRSQKPASIRPPTFPPISLVTYMKNP
FNNLPKHSSSLQINANRYI

DOR77nt

ATGGAATTGATGCGAGTGCCAGTACAGTTTTACAGAACGATTGGAGAGGATATCTACG 
CCCATCGATCCACGAATCCCCTAAAATCGCTTCTCTTCAAGATCTATCTATATGCGGG 
ATTCATAAATTTTAATCTGTTGGTAATCGGTGAACTGGTGTTCTTCTACAACTCAATT 
CAGGACTTTGAAACCATTCGATTGGCCATCGCGGTGGCTCCATGTATCGGATTTTCTC 
TGGTTGCTGATTTTAAACAAGCTGCCATGATTAGAGGCAAGAAAACACTAATTATGCT 
ACTCGATGATTTGGAGAACATGCATCCGAAAACCCTGGCAAAGCAAATGGAATACAAA 
TTGCCGGACTTTGAAAAGACCATGAAACGTGTGATCAATATATTCACCTTTCTCTGCT 
TGGCCTATACGACTACGTTCTCCTTTTATCCGGCCATCAAGGCATCCGTGAAATTTAA 
TTTCTTGGGCTACGACACCTTTGATCGAAATTTTGGTTTCCTCATCTGGTTTCCCTTC 
GATGCAACAAGGAATAATTTGATATACTGGATCATGTACTGGGACATAGCCCATGGGG 
CCTATCTAGCGGCCTTTCAGGTCACCGAATCAACAGTGGAAGTGATTATTATTTACTG 
CATTTTTTTGATGACCTCGATGGTTCAGGTATTTATGGTGTGCTACTATGGGGATACT 
TTAATTGCCGCGAGCTTGAAAGTGGGCGATGCCGCTTACAACCAAAAGTGGTTTCAGT 
GCAGCAAATCCTATTGCACCATGTTGAAGTTGCTAATCATGAGGAGTCAGAAACCAGC 
TTCAATAAGACCGCCGACTTTTCCCCCCATATCCTTGGTTACCTATATGAAGAATCCC 
TTCAACAATCTACCCAAACACAGCTCTTCCCTGCAAATCAACGCCAATCGCTATATC 

DOR78

MKFMKYAVFFYTSVGIEPYTIDSRSKKASLWSHLLFWANVINLSVIVFGEILYLGVAY
SDGKFIDAVTVLSYIGFVIVGMSKMFFIWWKKTDLSDLVKELEHIYPNGKAEEEMYRL
DRYLRSCSRISITYALLYSVLIWTFNLFSIMQFLVYEKLLKIRVVGQTLPYLMYFPWN
WHENWTYYVLLFCQNFAGHTSASGQISTDLLLCAVATQVVMHFDYLARVVEKQVLDRD
WSENSRFLAKTVQYHQRILRLMDVLNDIFGIPLLLNFMVSTFVICFVGFQMTVGVPPD
IMIKLFLFLFSSLSQVYLICHYGQLIADAVRDFRSSSLSISAYKQNWQNADIRYRRAL
VFFIARPQRTTYLKATIFMNITRATMTDVRYNLKCH

DOR78nt

ATGAAGTTCATGAAGTACGCAGTTTTCTTTTACACATCGGTGGGCATTGAGCCGTATA 
CGATTGACTCGCGGTCCAAAAAAGCGAGCCTATGGTCACATCTTCTCTTCTGGGCCAA 
TGTGATCAATTTAAGTGTCATTGTTTTCGGAGAGATCCTCTATCTGGGAGTGGCCTAT 
TCCGATGGAAAGTTCATTGATGCCGTCACTGTACTGTCATATATCGGATTCGTAATCG 
TGGGCATGAGCAAGATGTTCTTCATATGGTGGAAGAAGACCGATCTAAGCGATTTGGT 
TAAGGAATTGGAGCACATCTATCCAAATGGCAAAGCTGAGGAGGAGATGTATCGGTTG 
GATAGGTATCTGCGATCTTGTTCACGAATTAGCATTACCTATGCACTACTCTACTCCG 
TACTCATCTGGACCTTCAATCTGTTCAGTATCATGCAATTCCTTGTCTATGAAAAGTT
GCTTAAAATCCGAGTGGTCGGCCAAACGCTGCCATATTTGATGTACTTTCCCTGGAAC 
TGGCATGAAAACTGGACGTATTATGTGCTGCTGTTCTGTCAAAACTTCGCAGGACATA 
CTTCGGCATCGGGACAGATCTCTACGGATCTTTTGCTTTGTGCTGTTGCTACCCAGGT 
GGTAATGCACTTCGATTACTTGGCCAGAGTGGTGGAAAAACAAGTGTTAGATCGCGAT
TGGAGCGAAAACTCCAGATTTTTGGCAAAAACTGTACAATATCATCAGCGCATTCTTC 
GGCTAATGGACGTTCTCAACGATATATTCGGGATACCGCTACTGCTTAACTTTATGGT 
CTCCACATTTGTCATCTGCTTTGTGGGATTCCAAATGACCGTGGGTGTCCCGCCGGAC 
ATCATGATTAAGCTCTTCTTGTTCCTGTTCTCGTCCTTGTCGCAAGTGTACTTGATAT 
GCCACTACGGCCAGCTGATTGCCGATGCGGTAAGAGACTTTCGAAGCTCTAGCTTATC
GATTTCTGCATATAAGCAGAATTGGCAAAATGCTGACATTCGCTATCGTCGGGCTCTG 
GTATTCTTTATAGCTCGACCTCAGAGGACAACTTATCTAAAAGCTACAATTTTCATGA 
ATATAACAAGGGCCACCATGACGGACGTAAGATACAATTTGAAATGTCAT 

DOR81

MMETLRNSGLNLKKDFGIGRKIWRVFSFTYNMVILPVSFPINYVIHLAEFPPELLLQS
LQLCLNTWCFALKFFTLIVYTHRLELANKHFDELDKYCVKPAEKRKVRDMVATITRLY
LTFVVVYVLYATSTLLDGLLHHRVPYNTYYPFINWRVDRTQMYIQSFLEYFTVGYAIY
VATATDSYPVIYVAALRTHILLLKDRIIYLGDPSNEGSSDPSYMFKSLVDCIKAHRTM
LNFCDAIQPIISGTIFAQFIICGSILGIIMINMVLFADQSTRFGIVIYVMAVLLQTFP
LCFYCNAIVDDCKELAHALFHSAWWVQDKRYQRTVIQFLQKLQQPMTFTAMNIFNINL
ATNINVSPLLSVRTGKEAKSELQSLQVAKFAFTVYAIASGMNLDQKLSIKE

DOR81nt 

ATGATGGAGACGCTGCGAAATTCGGGCTTGAATTTGAAGAACGATTTCGGTATAGGCC

GCAAGATTTGGAGGGTGTTTTCGTTCACCTACAATATGGTGATACTTCCCGTAAGTTT 
CCCAATCAACTATGTGATACATCTGGCGGAGTTCCCGCCGGAGCTGCTGCTGCAATCC 
CTGCAACTGTGCCTCAACACTTGGTGCTTCGCTCTGAAGTTCTTCACTCTGATCGTCT 
ATACGCACCGCTTGGAGCTGGCCAACAAGCACTTTGACGAATTGGATAAGTACTGCGT 
GAAGCCGGCGGAGAAGCGCAAGGTTCGCGACATGGTGGCCACTATTACAAGACTGTAC
CTGACCTTCGTCGTGGTCTACGTCCTCTACGCCACCTCCACGCTACTGGACGGACTAC 
TGCACCACCGTGTTCCCTACAATACGTACTATCCGTTCATAAACTGGCGAGTCGATCG 
GACCCAGATGTACATCCAGAGTTTTCTGGAGTACTTCACCGTGGGTTATGCCATATAT 
GTGGCCACCGCCACCGATTCCTACCCTGTGATTTACGTGGCAGCCCTGCGAACTCATA 

TTCTCTTGCTCAAGGACCGTATCATTTACTTGGGCGATCCCAGCAACGAGGGTAGCAG

CGACCCGAGCTACATGTTTAAATCGTTGGTGGATTGTATCAAGGCACACAGAACCATG 
CTAAAGTGCAGTTTTTGTGATGCCATTCAACCAATCATCTCTGGCACGATATTTGCCC 
AATTCATCATATGCGGATCGATCCTGGGCATAATTATGATCAACATGGTATTGTTCGC 



TGATCAATCGACCCGATTCGGCATAGTCATCTACGTTATGGCCGTCCTTCTGCAGACT 
TTTCCGCTTTGCTTCTACTGCAACGCCATCGTGGACGACTGCAAAGAACTGGCCCACG 
CACTTTTCCATTCCGCCTGGTGGGTGCAGGACAAGCGATACCAGCGGACTGTCATCCA 
GTTCCTGCAGAAACTGCAGCAGCCCATGACCTTCACCGCCATGAACATATTTAACATT 
AATTTGGCCACTAACATCAATGTAAGTCCACTGCTCTCGGTTAGAACGGGGAAGGAAG
CAAAGTCCGAACTTCAATCCTTGCAGGTAGCCAAGTTCGCCTTCACCGTGTACGCCAT 
CGCGAGCGGTATGAACCTGGACCAAAAGTTAAGCATTAAGGAA 

DOR82

MACIPRYQWKGRPTERQFYASEQRIVFLLGTICQIFQITGVLIYWYCNGRLATETGTP
VAQLSEMCSSFCLTFVGFCNVYAISTNRNQIETLLEELHQIYPRYRKNHYRCQHYFDM
AMTIMRIEFLFYMILYVYYNSAPLWVLLWEHLHEEYDLSFKTQTNTWFPWKVHGSALG
FGMAVLSITVGSFVGVGFSIVTQNLICLLTFQLKLHYDGISSQLVSLDCRRPGAHKEL
SILIAHHSRILQLGDQVNDIMNFVFGSSLVGATIAICMSSVSIMLLDLASAFKYASGL
VAFVLYNFVICYMGTEVTLAVKIGSYMDGRRWIPKDSLLRSQRLQVLVAVGFFNICVL
SNRRPKIEILLRYYYHIMFYSFKLYFSLRKGSLWKILSSFTLLRI 

DOR82nt



ATGGCATGCATACCAAGATATCAATGGAAAGGACGCCCTACTGAAAGACAGTTCTACG 
CTTCGGAGCAAAGGATAGTGTTCCTTCTTGGAACCATTTGCCAGATATTCCAGATTAC
TGGAGTGCTTATCTATTGGTATTGCAATGGCCGTCTTGCCACGGAAACGGGCACCTTT 
GTGGCACAATTATCTGAAATGTGCAGTTCTTTTTGTCTAACATTTGTGGGATTCTGTA 
ACGTTTATGCGATCTCTACAAACCGCAATCAAATTGAAACATTACTCGAGGAGCTTCA 
TCAGATATATCCGAGATACAGGAAAAATCACTATCGCTGCCAGCATTATTTTGACATG 

GCCATGACAATAATGAGAATTGAGTTTCTTTTCTATATGATCTTGTACGTGTACTACA

ATAGTGCACCATTATGGGTGCTTCTTTGGGAACACTTGCACGAGGAATATGATCTTAG 
CTTCAAGACGCAGACCAACACTTGGTTTCCATGGAAAGTCCATGGGTCGGCACTTGGA 
TTTGGTATGGCTGTACTAAGCATAACCGTGGGATCCTTTGTGGGCGTAGGTTTCAGTA 
TTGTCACCCAGAATCTTATCTGTTTGTTAACCTTCCAACTAAAGTTGCACTACGATGG 

AATATCCAGTCAGTTAGTATCTCTCGATTGCCGTCGTCCTGGAGCTCATAAGGAGTTG

AGCATCCTCATCGCCCACCACAGCCGAATCCTTCAGCTGGGCGACCAAGTCAATGACA 
TAATGAACTTTGTATTCGGCTCTAGCCTAGTAGGTGCCACTATTGCCATTTGTATGTC 
AAGTGTTTCTATAATGCTACTGGACTTAGCATCTGCCTTCAAATATGCCAGTGGTCTA 
GTGGCATTCGTCCTCTACAACTTTGTCATCTGCTACATGGGAACCGAGGTCACTTTAG 
CTGTGAAGATTGGTTCATATATGGACGGAAGGCGGTGGATACCCAAAGATTCGTTGCT
GAGATCTCAGAGGCTACAGGTGCTCGTCGCAGTTGGATTTTTTAATATATGTGTCCTC 
TCGAATCGTCGTCCTAAAATTGAAATTTTGCTTAGATATTATTACCATATTATGTTTT 
ATTCATTTAAATTATATTTTTCTTTAAGGAAAGGTAGCCTTTGGAAAATCTTGTCTTC 



TTTCACCTTATTGAGGATC 



DOR83 

MQLEDFMRYPDLVCQAAQLPRYTWNGRRSLEVKRNLAKRIIFWLGAVNLVYHNIGCVM
YGYFGDGRTKDPIAYLAELASVASMLGFTIVGTLNLWKMLSLKTHFENLLNEFEELFQ
LIKHRAYRIHHYQEKYTRHIRNTFIFHTSAVVYYNSLPILLMIREHFSNSQQLGYRIQ
SNTWYPWQVQGSIPGFFAAVACQIFSCQTNMCVNMFIQFLINFFGIQLEIHFDGLARQ
LETIDARNPHAKDQLKYLIVYHTKLLNLADRVNRSFNFTFLISLSVSMISNCFLAFSM
TMFDFGTSLKHLLGLLLFITYNFSMCRSGTHLILTSGKVLPAAFYNNWYEGDLVYRRM
LLILMMRATKPYMWKTYKLAPVSITTYMAECKTKEAHEQRHFRRHERQKPRVARI

DOR83nt 

ATGCAGTTGGAGGACTTTATGCGGTACCCGGACCTCGTGTGTCAAGCGGCCCAACTTC 
CCAGATACACGTGGAATGGCAGACGATCCTTGGAAGTTAAACGCAACTTGGCAAAACG 

CATTATCTTCTGGCTTGGAGCAGTAAATTTGGTTTATCACAATATTGGCTGCGTCATG
TATGGCTATTTCGGTGATGGAAGAACAAAGGATCCAATTGCGTATTTAGCTGAATTGG 
CATCTGTGGCCAGCATGCTTGGTTTCACCATTGTGGGCACCCTCAACTTGTGGAAGAT 
GCTGAGCCTTAAGACCCATTTTGAGAACCTACTAAATGAATTCGAGGAATTATTTCAA 
CTAATCAAGCACAGGGCGTATCGCATACACCACTATCAAGAAAAGTATACGCGTCATA 

TACGAAATACATTTATTTTCCATACCTCTGCCGTTGTCTACTACAACTCACTACCAAT
TCTTCTAATGATTCGGGAACATTTCTCGAACTCACAGCAGTTGGGCTATAGAATTCAG 
AGTAATACCTGGTATCCCTGGCAGGTTCAGGGATCAATTCCTGGATTTTTTGCTGCAG 
TCGCCTGTCAAATCTTTTCGTGCCAAACCAATATGTGCGTCAATATGTTTATCCAGTT 
TCTGATCAACTTTTTTGGTATCCAGCTAGAAATACACTTCGATGGTTTGGCCAGGCAG

CTGGAGACCATCGATGCCCGCAATCCCCATGCCAAGGATCAATTGAAGTATCTGATTG

TATATCACACAAAATTGCTTAATCTAGCCGACAGAGTTAATCGATCGTTTAACTTTAC 
GTTTCTCATAAGTCTGTCGGTATCCATGATATCCAACTGTTTTCTGGCATTTTCCATG 
ACCATGTTCGACTTTGGCACCTCTCTAAAACATTTACTCGGACTTTTGCTATTCATCA 
CATATAATTTTTCAATGTGCCGCAGTGGTACGCACTTGATTTTAACGAGTGGCAAAGT 

ATTGCCAGCGGCCTTTTATAACAATTGGTATGAAGGCGATCTTGTTTATCGAAGGATG

CTCCTCATCCTGATGATGCGTGCTACGAAACCTTATATGTGGAAAACCTACAAGCTGG 
CACCTGTATCCATAACTACATATATGGCAGAATGCAAAACAAAAGAAGCCCATGAACA 
ACGCCATTTTAGACGCCATGAAAGACAAAAACCTCGGGTTGCACGAATA 

DOR84

MVFSFYAEVATLVDRLRDNENFLESCILLSYVSFVVMGLSKIGAVMKKKPKMTALVRQ
LETCFPSPSAKVQEEYAVKSWLKRCHIYTKGFGGLFMIMYFAHALIPLFIYFIQRVLL
HYPDAKQIMPFYQLEPWEFRDSWLFYPSYFHQSSAGYTATCGSIAGDLMIFAVVLQVI
MHYERLAKVLREFKIQAHNAPNGAKEDIRKLQSLVANHIDILRLTDLMNEVFGIPLLL
NFIASALLVCLVGVQLTIALSPEYFCKQMLFLISVLLEVYLLCSFSQRLIDAVC

DOR84nt

ATGGTGTTTAGTTTTTATGCCGAGGTAGCGACTCTGGTGGACAGGTTACGCGATAATG 
AAAATTTTCTCGAGAGCTGCATCTTACTGAGCTACGTGTCCTTTGTGGTCATGGGCCT 
CTCCAAGATAGGTGCTGTAATGAAAAAAAAGCCAAAAATGACAGCTTTGGTCAGGCAA 
TTGGAGACCTGCTTTCCGTCGCCAAGTGCAAAGGTTCAAGAGGAATATGCTGTGAAGT 

CCTGGCTGAAACGCTGCCATATATACACAAAGGGATTTGGTGGTCTCTTCATGATCAT
GTATTTCGCTCACGCTCTGATTCCCTTATTCATATACTTCATTCAAAGAGTGCTGCTC 
CACTATCCGGATGCCAAGCAGATTATGCCGTTTTACCAACTCGAACCTTGGGAATTTC 
GCGACTCCTGGTTGTTTTATCCAAGCTATTTTCACCAGTCGTCGGCCGGATATACGGC 
TACATGTGGATCCATTGCCGGTGACCTAATGATCTTCGCTGTGGTCCTGCAGGTCATC 

ATGCACTACGAAAGACTGGCCAAGGTTCTTAGGGAGTTTAAGATTCAAGCCCATAACG
CACCCAATGGAGCTAAGGAGGATATAAGGAAGTTGCAGTCCCTAGTCGCCAATCACAT 
TGATATACTTCGACTCACTGATCTGATGAACGAGGTCTTTGGAATTCCCTTGTTGCTA 
AACTTTATTGCATCTGCGCTGCTGGTCTGCCTGGTGGGAGTTCAATTAACCATCGCTT 
TAAGTCCAGAGTATTTTTGCAAGCAGATGCTATTTCTGATTTCCGTACTGCTTGAGGT 

CTATCTCCTTTGCTCCTTCAGCCAGAGGTTAATAGATGCTGTATGT

DOR87

MTIEDIGLVGINVRMWRHLAVLYPTPGSSWRKFAFVLPVTAMNLMQFVYLLRMWGDLP
AFILNMFFFSAIFNALMRTWLVIIKRRQFEEFLGQLATLFHSILDSTDEWGRGILRRA
EREARNLAILNLSASFLDIVGALVSPLFREERAHPFGVALPGVSMTSSPVYEVIYIAQ
LPTPLLLSMMYMPFVSLFAGLAIFGKAMLQILVHRLGQIGGEEQSEEERFQRLASCIA
YHTQVMRYVWQLNKLVANIVAVEAIIFGSIICSLLFCLNIITSPTQVISIVMYILTML
YVLFTYYNRANEICLEHNRVAEAVYNVPWYEAGTRFRKTLLIFLMQTQHPMEIRVGNV
YPMTLAMFQSLLNASYSYFTMLRGVTGK 


DOR87nt 

GGCACGAGGCTTATAGAAAGTGCCGAGCAATGACAATCGAGGATATCGGCCTGGTGGG 
CATCAACGTGCGGATGTGGCGACACTTGGCCGTGCTGTACCCCACTCCGGGCTCCAGC 
TGGCGCAAGTTCGCCTTCGTGCTGCCGGTGACTGCGATGAATCTGATGCAGTTCGTCT 

ACCTGCTGCGGATGTGGGGCGACCTGCCCGCCTTCATTCTGAACATGTTCTTCTTCTC

GGCCATTTTCAACGCCCTGATGCGCACGTGGCTGGTCATAATCAAGCGGCGCCAGTTC 
GAGGAGTTTCTCGGCCAACTGGCCACTCTGTTCCATTCGATTCTCGACTCCACCGACG



AGTGGGGGCGTGGCATCCTGCGGAGGGCGGAACGGGAGGCTCGGAACCTGGCCATCCT 
TAATTTGAGTGCCTCCTTCCTGGACATTGTCGGTGCTCTGGTATCGCCGCTTTTCAGG 
GAGGAGAGAGCTCATCCCTTCGGCGTAGCTCTACCAGGAGTGAGCATGACCAGTTCAC 
CCGTCTACGAGGTTATCTACTTGGCCCAACTGCCTACGCCCCTGCTGCTGTCCATGAT 
GTACATGCCTTTCGTCAGCCTTTTTGCCGGCCTGGCCATCTTTGGGAAGGCCATGCTG 
CAGATCCTGGTACACAGGCTGGGCCAGATTGGCGGAGAAGAGCAGTCGGAGGAGGAGC 
GCTTCCAAAGGCTGGCCTCCTGCATTGCGTACCACACGCAGGTGATGCGCTATGTGTG 
GCAGCTCAACAAACTGGTGGCCAACATTGTGGCGGTGGAAGCAATTATTTTTGGCTCG 
ATAATCTGCTCACTGCTCTTCTGTCTGAATATTATAACCTCACCCACCCAGGTGATCT 
CGATAGTGATGTACATTCTGACCATGCTGTACGTTCTCTTCACCTACTACAATCGGGC 
CAATGAAATATGCCTCGAGAACAACCGGGTGGCGGAGGCTGTTTACAATGTGCCCTGG 
TACGAGGCAGGAACTCGGTTTCGCAAAACCCTCCTGATCTTCTTGATGCAAACACAAC 
ACCCGATGGAGATAAGAGTCGGCAACGTTTACCCCATGACATTGGCCATGTTCCAGAG 
TCTGTTGAATGCGTCCTACTCCTACTTTACCATGCTGCGTGGCGTCACCGGCAAATGA 
GCTGAAAGACCGAAAAAACCGGAGTATCCCCTTCCATATTCCCCCTGCTCCTTTATTT 
TCCTTTCCTTTTCCCTTTCCGTTTTCCCATTCGCTTTTCCAGCAATCCGGGTAATGCA 
AAAAGTTGTTGCTGGCTGTGGTCCTGGCTGCTTGTTTGGCATTTGCATATGCTTGTCG 
TTTGAAAGGATTTAATCGGACTGCTGGCACGGAGTCGGCATCCTGGCTCCTGGATCCT 
GGCATGCAAATAGTTGGCTTCTTAGATTGTTACACAAAATAGATTGTAGATTGCAGCT 
GAATGTTGTGCTTGGAATAAAGTCAAAAGGATGTGGAGTCGGCCCAAGGCTCTGCCCA 
TTCTGTTTGCTCGGGATGCCCGAAAGTATGAAAAAAAAAAAAAAAAAA 

DOR91 

MVRYVPRFADGQKVKLAWPLAVFRLNHIFWPLDPSTGKWGRYLDKVLAVAMSLVFMQH
NDAELRYLRFEASNRNLDAFLTGMPTYLILVEAQFRSLHILLHFEKLQKFLEIFYANI
YIDPRKEPEMFRKVDGKMIINRLVSAMYGAVISLYLIAPVFSIINQSKDFLYSMIFPF
DSDPLYIFVPLLLTNVWVGIVIDTMMFGETNLLCELIVHLNGSYMLLKRDLQLAIEKI
LVARDRPHMAKQLKVLITKTLRKNVALNQFGQQLEAQYTVRVFIMFAFAAGLLCALSF
KAYTTDSLSTMYYLTHWEQILQYSTNPSENLRLLKLINLAIEMNSKPFYVTGLKYFRV
SLQAGLKRQKFLRSASSSTLSTADVLAFAFAFTRWLL

DOR91nt

ATGGTTCGTTACGTGCCCCGGTTCGCTGATGGTCAGAAAGTAAAGTTGGCTTGGCCCT 
TGGCGGTTTTTCGGTTAAATCACATATTCTGGCCATTGGATCCGAGCACAGGGAAATG 
GGGCCGATATCTGGACAAGGTTCTAGCTGTTGCGATGTCCTTGGTTTTTATGCAACAC 
AACGATGCAGAGCTGAGGTACTTGCGCTTCGAGGCAAGTAATCGGAATTTGGATGCCT 
TTCTCACAGGAATGCCAACGTATTTAATCCTCGTGGAGGCTCAATTTAGAAGTCTTCA 
CATTCTACTGCACTTCGAGAAGCTTCAGAAGTTTTTAGAAATATTCTACGCAAATATT 



TATATTGATCCCCGTAAGGAACCCGAAATGTTTCGAAAAGTGGATGGAAAGATGATAA 
TTAACAGATTAGTTTCGGCCATGTACGGTGCAGTTATCTCTCTGTATCTAATCGCACC 
CGTTTTTTCCATCATTAACCAAAGCAAAGATTTTCTATACTCTATGATCTTTCCGTTC 
GATTCGGATCCCTTGTACATATTTGTGCCACTGCTTTTGACAAACGTATGGGTTGGCA 
TTGTAATAGATACCATGATGTTCGGGGAGACGAATTTGTTGTGTGAACTAATTGTCCA 
CCTAAATGGTAGTTATATGTTGCTCAAGAGGGACTTGCAGTTGGCCATTGAAAAGATA 
TTAGTTGCAAGGGACCGTCCGCATATGGCCAAACAGCTAAAGGTTTTAATTACAAAAA 
CTCTCCGAAAGAATGTGGCTCTAAATCAGTTTGGCCAGCAGCTGGAGGCTCAGTATAC 
TGTGCGGGTTTTTATTATGTTTGCATTCGCTGCGGGCCTTTTATGTGCTCTTTCTTTT 
AAGGCTTATACGACGGATTCCCTCAGCACAATGTACTACCTTACCCATTGGGAGCAAA 
TCCTGCAGTACTCTACAAATCCCAGCGAAAATCTGCGATTACTAAAGCTCATTAACTT 
GGCCATTGAGATGAACAGCAAGCCCTTCTATGTGACAGGGCTAAAATATTTTCGCGTT 
AGTCTGCAGGCTGGCTTAAAACGTCAAAAGTTTCTGCGGTCTGCCAGCTCATCCACCC 
TTAGCACCGCTGATGTGTTGGCATTTGCTTTTGCTTTTACTCGCTGGCTGCTT 

DOR92 

MSEWLRFLKRDQQLDVYFFAVPRLSLDIMGYWPGKTGDTWPWRSLIHFAILAIGVATE 
LHAGMCFLDRQQITLALETLCPAGTSAVTLLKMFLMLRFRQDLSIMWNRLRGLLFDPN
WERPEQRDIRLKHSAMAARINFWPLSAGFFTCTTYNLKPILIAMILYLQNRYEDFVWF
TPFNMTMPKVLLNYPFFPLTYIFIAYTGYVTIFMFGGCDGFYFEFCAHLSALFEVLQA
EIESMFRPYTDHLELSPVQLYILEQKMRSVIIRHNAIIDLTRFFRDRYTIITLAHFVS
AAMVIGFSMVNLLTLGNNGLGAMLYVAYTVAALSQLLVYCYGGTLVAESSTGLCRAMF
SCPWQLFKPKQRRLVQLLILRSQRPVSMAVPFFSPSLATFAAILQTSGSIIALVKSFQ 

DOR92nt 

ATGTCCGAGTGGTTACGCTTTCTGAAACGCGATCAACAGCTGGATGTGTACTTTTTTG 
CAGTGCCCCGCTTGAGTTTAGACATAATGGGCTATTGGCCGGGCAAAACTGGTGATAC 
ATGGCCCTGGAGATCCCTGATTCACTTCGCAATCCTGGCCATTGGCGTGGCCACCGAA 
CTGCATGCTGGCATGTGTTTTCTAGACCGACAGCAGATTACCTTGGCACTGGAGACCC 
TCTGTCCAGCTGGCACATCGGCGGTCACGCTGCTCAAGATGTTCCTAATGCTGCGCTT
TCGTCAGGATCTCTCCATTATGTGGAACCGCCTGAGGGGCCTGCTCTTCGATCCCAAC 
TGGGAGCGACCCGAGCAGCGGGACATCCGGCTAAAGCACTCGGCCATGGCGGCTCGCA 
TCAATTTCTGGCCCCTGTCAGCCGGATTCTTCACATGCACCACCTACAACCTAAAGCC 
GATACTGATCGCAATGATATTGTATCTCCAGAATCGTTACGAGGACTTCGTTTGGTTT 
ACACCCTTCAATATGACTATGCCCAAAGTTCTGCTAAACTATCCATTTTTTCCCCTGA 
CCTACATATTTATTGCCTATACGGGCTATGTGACCATCTTTATGTTCGGCGGCTGTGA 
TGGTTTTTATTTCGAGTTCTGTGCCCACCTATCAGCTCTTTTCGAAGTGCTCCAGGCG 
GAGATAGAATCAATGTTTAGAGCCTACACTGATCACTTGGAACTGTCGCCAGTGCAGC 
TTTACATTTTAGAGCAAAAGATGCGATCAGTAATCATTAGGCACAATGCCATCATCGA
TTTGACCAGATTTTTTCGTGATCGCTATACCATTATTACCCTGGCCCATTTTGTGTCC 
GCCGCCATGGTGATTGGATTCAGCATGGTTAATCTCCTGACATTGGGCAATAATGGTC 
TGGGCGCAATGCTCTATGTGGCCTACACGGTTGCCGCTTTGAGCCAACTGCTGGTTTA 
TTGCTATGGCGGAACTCTGGTGGCCGAAAGTAGCACTGGTCTGTGCCGAGCCATGTTC
TCCTGTCCGTGGCAGCTTTTTAAGCCTAAACAACGTCGACTCGTTCAGCTTTTGATTC 
TCAGATCGCAGCGTCCTGTTTCCATGGCAGTGCCATTCTTTTCGCCATCGTTGGCTAC 
CTTTGCTGCGATTCTTCAAACTTCGGGTTCCATAATTGCGCTGGTTAAGTCCTTTCAG 

DOR95

MSDKVKGKKQEEKDQSLRVQILVYRCMGIDLWSPTMANDRPWLTFVTMGPLFLFMVPM 
F1AAHEYITQVSLLSDTLGSTFASMLTLVKFLLFCYHRKEFVGLIYHIRAIIAKEIEV 
WPDAREIIEVENQSDQMLSLTYTRCFGLAGIFAALKPFVGIILSSIRGDEIHLELPHN
GVYPYDLQVVMFYVPTYLWNVMASYSAVTMALCVDSLLFFFTYNVCAIFKIAKHRMIH
LPAVGGKEELEGLVQVLLLHQKGLQIADHIADKYHPLIFLQFFLSALQICFIGFQVAD
LFPNPQSLYFIAFVGSLLIALFIYSKCGENIKSASLDFGNGLYETNWTDFSPPTKRAL
LIAAMRAQRPCQMKGYFFEASMATFSTIVRSAVSYIMMLRSFNA

DOR95nt

ATGAGCGACAAGGTGAAGGGAAAAAAGCAGGAGGAAAAGGATCAATCCTTGCGGGTGCAAATTC
CCAGCTATAGTGCTGTAACCATGGCACTCTGCGTGGACTCGCTGCTCTTCTTTTTCAC 
CTACAACGTGTGCGCCATTTTCAAGATCGCCAAGCACCGGATGATCCATCTGCCGGCG 
GTGGGCGGAAAGGAGGAGCTGGAGGGGCTCGTCCAGGTGCTGCTGCTGCACCAGAAGG 
GCCTCCAGATCGCCGATCACATTGCGGACAAGTACCGGCCGCTGATCTTTTTGCAGTT 

CTTTCTGTCCGCCTTGCAGATCTGCTTCATTGGATTCCAGGTGGCTGATCTGTTTCCC

AATCCGCAGAGTCTCTACTTTATCGCCTTTGTGGGCTCGCTGCTCATCGCACTGTTCA 
TCTACTCGAAGTGCGGCGAAAATATCAAGAGTGCCAGCCTGGATTTCGGAAACGGGCT 
GTACGAGACCAACTGGACCGACTTCTCGCCACCCACTAAAAGAGCCCTCCTCATTGCC 
GCCATGCGCGCCCAGCGACCTTGCCAGATGAAGGGCTACTTTTTCGAGGCCAGCATGG 

CCACCTTCTCGACGATTGTTCGCTCTGCCGTGTCGTACATCATGATGTTGCGCTCCTT

TAATGCC 

DOR99 

MEEFLRPQMFQEVAQMVHFQWRRNPVDNSMVNASMVPFCLSAFLNVLFFGCNGWDIIG
HFWLGHPANQNPPVLSITIYFSIRGLMLYLKRKEIVEFVNDLDRECPRDLVSQLDMQM
DETYRNFWQRYRFIRIYSHLGGPMFCVVPIALFLLTHEGKDTPVAQHEQLLGGWLPCG
VRKDPNFYLLVWSFDLMCTTCGVSFFVTFDNLFNVMQGHLVMHLGHLARQFSAIDPRQ
SLTDEKRFFVDLRLLVQRQQLLNGLCRKYNDIFKVAFLVSNFVGAGSLCFYLFMLSET
SDVLIIAQYILPTLVLVGFTFEICLRGTQLEKASEGLESSLRSQEWYLGSRRYRKFYL
LWTQYCQRTQQLGAFGLIQVNMVHFTEIMQLAYRLFTFLKSH 

DOR99nt

ATGGAGGAGTTTCTGCGTCCGCAGATGTTCCAGGAGGTGGCTCAGATGGTGCATTTCC 
AGTGGCGGAGAAATCCGGTGGACAACAGCATGGTGAACGCATCCATGGTCCCCTTCTG 
CTTGTCGGCGTTTCTTAATGTCCTGTTTTTCGGCTGCAATGGTTGGGACATCATAGGA 
CATTTTTGGCTGGGACATCCTGCCAACCAGAATCCGCCCGTGCTTAGCATCACCATTT 

ACTTCTCGATCAGGGGATTGATGCTATACCTGAAACGAAAGGAAATCGTTGAGTTTGT
TAACGACTTGGATCGGGAGTGTCCGCGGGACTTGGTCAGCCAGTTGGACATGCAAATG 
GATGAGACGTACCGAAACTTTTGGCAGCGCTATCGCTTCATCCGTATCTACTCCCATT 
TGGGTGGTCCGATGTTCTGCGTTGTGCCATTAGCTCTATTCCTCCTGACCCACGAGGG 
TAAAGATACTCCTGTTGCCCAGCACGAGCAGCTCCTTGGAGGATGGCTGCCATGCGGT 

GTGCGAAAGGACCCAAATTTCTACCTTTTAGTCTGGTCCTTCGACCTGATGTGCACCA
CTTGCGGCGTCTCCTTTTTCGTTACCTTCGACAACCTATTCAATGTGATGCAGGGACA 
TTTGGTCATGCATTTGGGCCATCTTGCTCGCCAGTTTTCGGCCATCGATCCTCGACAG 
AGTTTGACCGATGAGAAGCGATTCTTTGTGGATCTTAGGTTATTAGTTCAGAGGCAGC 
AGCTTCTTAATGGATTGTGCAGAAAATACAACGACATCTTTAAAGTGGCCTTCCTGGT 

GAGCAATTTTGTAGGCGCCGGTTCCCTCTGCTTCTACCTCTTTATGCTCTCGGAGACA
TCAGATGTCCTTATCATCGCCCAGTATATATTACCCACTTTGGTCCTGGTGGGCTTCA 
CATTTGAGATTTGTCTACGGGGAACCCAACTGGAAAAGGCGTCGGAGGGACTGGAATC 
GTCGTTGCGAAGCCAGGAATGGTATTTGGGAAGTAGGCGGTACCGGAAGTTCTATTTG 
CTCTGGACGCAATATTGCCAGCGAACACAGCAACTGGGCGCCTTTGGGCTAATCCAAG 

TCAATATGGTGCACTTCACTGAAATAATGCAGCTGGCCTATAGACTCTTCACTTTTCT
CAAATCTCAT 



MTTSMQPSKYTGLVADLMPNIRAMKYSGLFMHNFTGGSAFMKKVYSSVHLVFLLMQFT
FILVNMALNAEEVNELSGNTITTLFFTHCITKFIYLAVNQKNFYRTLNIWNQVNTHPL
FAESDARYHSIALAKMRKLFFLVMLTTVASATAWTTITFFGDSVKMVVDHETNSSIPV
EIPRLPIKSFYPWNASHGMFYMISFAFQIYYVLFSMIHSNLCDVMFCSWLIFACEQLQ
HLKGIMKPLMELSASLDTYRPNSAALFRSLSANSKSELIHNEEKDPGTDMDMSGIYSS
KADWGAQFRAPSTLQSFGGNGGGGNGLVNGANPNGLTKKQEMMVRSAIKYWVERHKHV
VRLVAAIGDTYGAALLLHMLTSTIKLTLLAYQATKINGVNVYAFTVVGYLGYALAQVF
HFCIFGNRLIEESSSVMEAAYSCHWYDGSEEAKTFVQIVCQQCQKAMSISGAKFFTVS
LDLFASVLGAVVTYFMVLVQLK

DORA45nt

GGCACGAGCTGGTTCCGGAAAGCCTCATATCTCGTATCTTAAAGTATCCCGGTTAAGC 
CTTAAAGAGTGAAATGATTGCCTAGACGATTGCTGCATTACTGGCACTCAATTAACCC 
AAGTGTACCAGACAACAATTACATTTGTATTTTTAAAGTTCAATAGCAAGGATGACAA 
CCTCGATGCAGCCGAGCAAGTACACGGGCCTGGTCGCCGACCTGATGCCCAACATCCG 
GGCGATGAAGTACTCCGGCCTGTTCATGCACAACTTCACGGGCGGCAGTGCCTTCATG 
AAGAAGGTGTACTCCTCCGTGCACCTGGTGTTCCTCCTCATGCAGTTCACCTTCATCC 
TGGTCAACATGGCCCTGAACGCCGAGGAGGTCAACGAGCTGTCGGGCAACACGATCAC 
GACCCTCTTCTTCACCCACTGCATCACGAAGTTTATCTACCTGGCTGTTAACCAGAAG 
AATTTCTACAGAACATTGAATATATGGAACCAGGTGAACACGCATCCCTTGTTCGCCG 
AGTCGGATGCTCGTTACCATTCGATCGCACTGGCGAAGATGAGGAAGCTGTTCTTTCT 
GGTGATGCTGACCACAGTCGCCTCGGCCACCGCCTGGACCACGATCACCTTCTTTGGC 
GACAGCGTAAAAATGGTGGTGGACCATGAGACGAACTCCAGCATCCCGGTGGAGATAC 
CCCGGCTGCCGATTAAGTCCTTCTACCCGTGGAACGCCAGCCACGGCATGTTCTACAT 
GATCAGCTTTGCCTTTCAGATCTACTACGTGCTCTTCTCGATGATCCACTCCAATCTA 
TGCGACGTGATGTTCTGCTCTTGGCTGATATTCGCCTGCGAGCAGCTGCAGCACTTGA 
AGGGCATCATGAAGCCGCTGATGGAGCTGTCCGCCTCGCTGGACACCTACAGGCCCAA 
CTCGGCGGCCCTCTTCAGGTCCCTGTCGGCCAACTCCAAGTCGGAGCTAATTCATAAT 
GAAGAAAAGGATCCCGGCACCGACATGGACATGTCGGGCATCTACAGCTCGAAAGCGG 
ATTGGGGCGCTCAGTTTCGAGCACCCTCGACACTGCAGTCCTTTGGCGGGAACGGGGG 
CGGAGGCAACGGGTTGGTGAACGGCGCTAATCCCAACGGGCTGACCAAAAAGCAGGAG 
ATGATGGTGCGCAGTGCCATCAAGTACTGGGTCGAGCGGCACAAGCACGTGGTGCGAC 
TGGTGGCTGCCATCGGCGATACTTACGGAGCCGCCCTCCTCCTCCACATGCTGACCTC 
GACCATCAAGCTGACCCTGCTGGCATACCAGGCCACCAAAATCAACGGAGTGAATGTC 
TACGCCTTCACAGTCGTCGGATACCTAGGATACGCGCTGGCCCAGGTGTTCCACTTTT 
GCATCTTTGGCAATCGTCTGATTGAAGAGAGTTCATCCGTCATGGAGGCCGCCTACTC 
GTGCCACTGGTACGATGGCTCCGAGGAGGCCAAGACCTTCGTCCAGATCGTGTGCCAG 
CAGTGCCAGAAGGCGATGAGCATATCGGGAGCGAAATTCTTCACCGTCTCCCTGGATT 
TGTTTGCTTCGGTTCTGGGTGCCGTCGTCACCTACTTTATGGTGCTGGTGCAGCTCAA 
GTAAGTTGCTGCGAAGCTGATGGATTTTTGTACCAGAAAAGCGAATGCCAAGAAGCCA 
CCTACCGCCCCTTGCCCCCTCCGCACTGTGCAACCAGCAATATCACAGAGCAATTATA 
ACGCAAATTATATATTTTATACCTGCGACGAGCGAGCCTCGTGGGGCATAATGGAGAC 
ATTCTGGGGCACATAGAAGCCTGCAAATACTTATCGATTTTGTACACGCGTAGAGCTT 
TTAATGTAAACTCAAGATGCAAACTAAATAAATGTGTAGTGAAAAAAAAAAAAAAAAA 



AAA 
GENBANK ACCESSION NUMBERS

The accession numbers for the sequences reported in this 
paper are AF127921-AF127926.